SECTION A


SUMMARY REPORT 



This section provides a brief summary report of the work accomplished
under the "Study of Communications Data Compression Methods", under NASA
contract NAS 2-9703. The results are fully explained in subsequent sections
on video compression, Landsat image processing, and satellite communications.

The first task of contract NAS 2-9703 was to extend a simple monochrome
conditional replenishment system to higher compression and to higher motion
levels, by incorporating spatially adaptive quantizers and field repeating.
Conditional replenishment combines intraframe and interframe compression, and 
both areas were to be investigated. The gain of conditional replenishment 
depends on the fraction of the image changing, since only changed parts of 
the image need to be transmitted. If the transmission rate is set so that 
only one-fourth of the image can be transmitted in each field, greater change 
fractions will overload the system. 

To accomplish task I, a computer simulation was prepared which incorporated
1) field repeat of changes, 2) a variable change threshold, 3) frame repeat for
high change, and 4) two mode, variable rate Hadamard intraframe quantizers. The
field repeat gives 2:1 compression in moving areas without noticeable degradation.
Variable change threshold allows some flexibility in dealing with varying change
rates, but the threshold variation must be limited for acceptable performance.
2:1 frame repeat provides 15 frames per second, and is acceptable, but 4:1 repeat
is objectionably degraded. The two mode Hadamard quantizers use a 2 bit per
picture element (bpp) edge subpicture quantizer and a 1 bpp flat subpicture
quantizer. For the Reasoner image, the two mode variable rate performance at
1.25 bpp is comparable to single mode performance at 2 bpp, but the gains depend
on the fraction of changed blocks using each mode.

The goal of task I was to achieve performance free of artifacts at 1/4 bpp.
The simulated system achieves this, for scenes with less than 50 percent motion.
The different compression ratios for changed parts of the image are 2:1 for
conditional replenishment, 2:1 for field repeat, 2:1 for frame repeat, and 4:1
for intraframe Hadamard compression (all edge mode). The total combined
compression ratio is 32:1, which reduces the original 8 bpp to 1/4 bpp. If the
fraction changing exceeds 50 percent, additional frame repeats occur, with a
limit of 4:1 for 100 percent change.

Task II of the statement of work was to extend conditional replenishment
to color video. This was accomplished by modifying the monochrome simulation
program to process both monochrome and color. The I and Q color signals are
quantized using 1/4 bpp in the edge mode, and 1/8 bpp in the flat mode, so that
color transmission requires a 25 percent higher transmission rate than monochrome.
Changes and modes are detected using only the Y or monochrome signal, which
introduces some artifacts.
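
The rate arithmetic quoted for tasks I and II can be checked with a short
sketch. It assumes an 8 bpp source, which is the value implied by a 32:1 ratio
yielding 1/4 bpp, and it takes the luminance edge and flat rates as 2 bpp and
1 bpp. The variable names are illustrative only.

```python
from fractions import Fraction
from math import prod

# Per-technique ratios for changed areas, as listed in the summary.
ratios = {
    "conditional replenishment": 2,
    "field repeat": 2,
    "frame repeat": 2,
    "intraframe Hadamard, edge mode": 4,
}
total_ratio = prod(ratios.values())

SOURCE_BPP = Fraction(8)  # assumption: implied by 32:1 giving 1/4 bpp
compressed_bpp = SOURCE_BPP / total_ratio

# Color overhead: I and Q each add 1/4 bpp (edge) or 1/8 bpp (flat)
# to a luminance rate of 2 bpp (edge) or 1 bpp (flat).
edge_overhead = 2 * Fraction(1, 4) / Fraction(2)
flat_overhead = 2 * Fraction(1, 8) / Fraction(1)

print(total_ratio, compressed_bpp, edge_overhead, flat_overhead)
```

Both modes give the same relative color overhead, which is why a single
percentage can be quoted for color transmission.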

Certain related aspects of video compression were investigated in conjunction
with these tasks. Frame interlace is a compression method which has some
aspects of intraframe compression, and some aspects of interframe compression.
Using previously available data, a report was prepared to compare the hardware
requirements and performance of field and frame compression. It was concluded
that the frame interlace gains about 1/2 bpp in compression, at a cost of 4
or 7 bit-frames of memory, depending on implementation.

D. Hein and N. Ahmed, under a grant from NASA-ARC, recently developed a
method for deriving the discrete cosine transform from the Hadamard transform.
This method is being implemented at NASA-ARC. In cooperation with D. Hein,
transforms intermediate between the discrete cosine and Hadamard transform in
complexity and performance were devised and investigated. Theoretical and
experimental results show that transforms with simplified implementation can
provide part of the gains of the discrete cosine transform over the Hadamard.

Monochrome and color conditional replenishment using several frame repeats 
was processed using frame averaging. Repeated frames were replaced by composite 
frames, formed by averaging the two transmitted frames closest in sequence. Frame 
averaging makes a small improvement in material using three frame repeats, or 10 
frames per second, but introduces strange blurring and shape changing in material 
using five repeats, or 6 frames per second. 
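
The frame averaging step described above can be sketched as follows. This is a
minimal illustration, assuming an unweighted average of the preceding and
following transmitted frames and frames stored as flat lists of sample values.
The function names and the `period` parameter are hypothetical.

```python
def average_frames(a, b):
    """Sample-by-sample unweighted average of two frames."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def display_sequence(transmitted, period):
    """Expand transmitted frames into a display sequence.

    Each transmitted frame is followed by period - 1 composite frames,
    formed by averaging it with the next transmitted frame, in place of
    plain repeats of the transmitted frame.
    """
    out = []
    for cur, nxt in zip(transmitted, transmitted[1:]):
        out.append(cur)
        composite = average_frames(cur, nxt)
        out.extend([composite] * (period - 1))
    out.append(transmitted[-1])
    return out


# Two transmitted frames of two samples each, shown at one third the frame rate.
frames = display_sequence([[0, 0], [6, 12]], period=3)
print(frames)
```

With `period=3` the display runs at 30 frames per second while only 10 new
frames per second are transmitted.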

Task III of the contract was to investigate image compression for the
transmission of Landsat images, with emphasis on using spectral correlation. The
proposal noted that cluster coding was a successful compression method using spectral
correlation, but had high complexity. A new method, "picture element replication",
was simulated. A table of previously transmitted elements is maintained at the
transmitter and receiver. When the image element to be transmitted is within
a small error distance of an element in the table, the location of the table
element is transmitted, instead of the element value. The table element
replaces the original element in the image. When no table element is sufficiently
similar to the element to be transmitted, the exact element value is transmitted,
and inserted in the table. Transmission rate is reduced because the number of
table elements is much smaller than the number of all possible elements.
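
The replication procedure can be sketched in a few lines. This is a hedged
illustration, not the simulated program. The distance measure (largest
absolute difference over the spectral bands), the first-in first-out
replacement when the table is full, and all names are assumptions that do not
come from the report.

```python
def replicate(pixels, max_dist, table_size):
    """Encode pixels by picture element replication.

    Returns the transmitted stream and the image as rebuilt at the
    receiver. Each stream entry is ("ref", table_index) or ("new", value).
    """
    table = []    # kept identical at transmitter and receiver
    stream = []
    decoded = []
    for p in pixels:
        best_index, best_dist = None, None
        for i, t in enumerate(table):
            # Assumed error distance: largest band-by-band difference.
            d = max(abs(a - b) for a, b in zip(p, t))
            if d <= max_dist and (best_dist is None or d < best_dist):
                best_index, best_dist = i, d
        if best_index is not None:
            # Send only the table location; the table element replaces p.
            stream.append(("ref", best_index))
            decoded.append(table[best_index])
        else:
            # Send the exact value and insert it in the table.
            stream.append(("new", p))
            decoded.append(p)
            table.append(p)
            if len(table) > table_size:
                table.pop(0)  # assumed first-in first-out replacement
    return stream, decoded


pixels = [(10, 20, 30, 40), (11, 20, 30, 41), (50, 60, 70, 80), (10, 21, 30, 40)]
stream, decoded = replicate(pixels, max_dist=1, table_size=4)
print(stream)
```

A table index needs far fewer bits than a full multispectral element value,
which is the source of the rate reduction.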

Picture element replication was analyzed theoretically. The expected error
is a known fraction of the error bound used to select table elements. If the
table size is sufficiently large, the transmission rate is only one or two bpp
more than required to transmit a table entry. Increasing table size reduces
error and increases rate in a known way. Additional compression can be gained
by using a one bit word for a replicated element.

Picture element replication was simulated, and tested with two images 
used in other work done for NASA-ARC. The new method was found superior to 
the previous spatial compression methods, and comparable to cluster coding. 

Picture element replication is much simpler in implementation than cluster 
coding, and is suitable for the transmission of stored data via satellite to 
a remote site. 

In conjunction with this task, a survey of Landsat classification methods
was made. There are several potentially useful techniques reported in the
literature. Methods of nonparametric classification, including nearest neighbor,
can reduce the human intervention required by normal distribution, maximum
likelihood classifiers. Spatial-spectral classifiers can produce classified
images similar to those produced by human image interpreters.

Task IV of the statement of work was to investigate signal processing,
channel geometry, and demand assignment for satellite communication. The
proposal stated the intention to review operational systems, and to evaluate
a technique to improve the bandwidth efficiency of satellite communications.
The technique selected for study, which is mentioned in the recent literature,
is four dimensional signal design. The report on four dimensional bandwidth
efficient modulation reviews the advantages of higher dimensional signal design,
compared to conventional multi-level amplitude phase shift keying. The potential
gain of four dimensional signal design is estimated using the capacity bound.
Two new classes of signal designs are introduced, which counter the disadvantages
of amplitude phase shift keying. One class of designs is based on the densest
four dimensional lattice, and is conjectured to approach the optimum designs.
At transmission rates needed for bandwidth efficient modulation, 2 to 3 bits
per dimension, the new designs increase rate 0.6 bits per dimension at fixed
signal to noise, or allow 2 to 4 dB signal reduction at fixed rate. The four
dimensional receivers are not much more complex than conventional receivers.

In the review of systems, the TRW "Mobile Multiple Access Study" was
examined. The study observed that continuously variable slope delta modulation
was nearly competitive with frequency modulation (FM). Several other papers
on satellite communication, and several power budgets, were reviewed. It seems
generally accepted that frequency division multiple access (FDMA) provides more
channels in a given bandwidth than time division or code division. Demand
assignment of channels by a central station can achieve 100 percent usage, while
contention systems are limited to lower efficiency. Of special interest were
two Comsat papers. Campanella and others concluded that digital voice modulation
and FM have similar performance for single channel per carrier voice channels.
Welti and Kwan also found digital competitive, and considered some four
dimensional methods in their comparison.

Some results of this study will be published. The operation of an earlier
conditional replenishment system was described in two papers given at the August
1977 SPIE convention and at the December 1977 NTC meeting. The study of frame
and field compression, appendix A of section B below, will be presented at SPIE
in August 1977. The investigation of transforms derived via the Hadamard
transform, appendix B of section B below, has been accepted for the ITC meeting in
November 1978. The report on Landsat element replication compression, section C
below, will be published in the IEEE Transactions on Communications if certain
required revisions are made.

This report describes the newly developed conditional replenishment system.
The conditional replenishment simulation program has been modified to have the
following features:

1) forced field repeat of changed subpictures 

2) a range of thresholds differing by one for all vectors tested; the 
threshold used is the lowest that does not give excessive changes 

3) frame repeat, when the number of changes is excessive and the change 
detection threshold is at its upper limit 

4) two mode replacement of changed Hadamard subpictures, using optimized 
flat and edge quantizers with fixed rate 

5) a forced refresh option, in addition to refresh using any rate not 
needed for update of changed subpictures. 

The forced field repeat of changes, with rate buffering over a frame, reduces
the transmission rate required for changes by 2:1. This causes no apparent loss
of motion rendition quality, but there is a reduction in vertical resolution which
is acceptable in moving areas. Selecting the change detection threshold based on
the number of changes for the current frame, rather than on past numbers of changes,
eliminates threshold-caused high change problems and is needed to allow the use of
optimized fixed rate quantizers. Frame repeat is sometimes better during high changes
than replacing only the larger changes by increasing the change detection threshold,
which causes a panned scene to break up into subpictures. Using optimized quantizers
and a two mode intraframe compression reduces the average rate required to replace
a changed subpicture, as further discussed below. The refresh is used both to
repair transmission errors and to bring unchanging subpictures to the full 3 bpp
resolution in both video fields.
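
The threshold selection of features 2 and 3 can be sketched as below. This is
an assumed reading rather than the simulation code: each subpicture is given a
single scalar change measure, a subpicture counts as changed when its measure
exceeds the threshold, and `max_changes` stands for the number of changed
subpictures the frame's rate budget can carry. All names are hypothetical.

```python
def choose_threshold(differences, thresholds, max_changes):
    """Pick the lowest threshold that does not give excessive changes.

    differences: change measure per subpicture for the current frame.
    thresholds: candidate thresholds in ascending order, differing by one.
    Returns (threshold, changed_indices), or (None, []) when even the
    upper limit gives too many changes, which signals a frame repeat.
    """
    for t in thresholds:
        changed = [i for i, d in enumerate(differences) if d > t]
        if len(changed) <= max_changes:
            return t, changed
    return None, []


differences = [0, 3, 5, 9, 2, 7]
chosen = choose_threshold(differences, range(1, 5), max_changes=3)
overloaded = choose_threshold(differences, range(1, 5), max_changes=2)
print(chosen, overloaded)
```

Because the threshold is chosen from the current frame's own change count, the
number of replaced subpictures never exceeds the budget assumed when the fixed
rate quantizers were designed.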

The compression ratios of the various techniques used in replacing changed
subpictures are shown in Table I, for operation at an average rate of 3/16 bpp over
a frame. The average rate required to replace a changed subpicture using two mode
compression is estimated to be 1 1/2 bpp. With forced field repeat, the configuration
can operate at fractions of change up to 1/4 without frame repeat. In the case of
a complete change, the frame rate drops to 1/4, or 7.5 frames per second. Forced
refresh could require another 1/16 bpp, for complete replacement of both fields
in 48 frames or 1 1/2 seconds. The change indicator overhead is 1 bit for each
subpicture, or 1/64 bpp. The flat/edge mode indicator overhead is another 1 bit
for each changed subpicture, or 1/64 bpp which can be included in the 1 1/2 bpp
for changed subpictures. The total average rate is 3/16 + 1/16 + 1/64, or 17/64
bpp, for monochrome.

Table I: Compression techniques and ratios for change replacement.

Method                          Compression ratios

1. Intraframe, two mode         8:1 1/2 (or 5 1/3:1)
2. Conditional replenishment    4:1      2:1      4/3:1    1:1
3. Forced field repeat          2:1
4. Frame repeat, as needed      1:1      2:1      3:1      4:1

Total compression ratio         42 2/3:1

Acceptable change fraction      <1/4     <1/2     <3/4     <1
Frame rate, per second          30       15       10       7 1/2
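
The entries of Table I and the total monochrome rate can be reproduced with
exact fractions. The sketch assumes an 8 bpp source, as implied by the
"8:1 1/2 (or 5 1/3:1)" intraframe entry. The variable names are illustrative.

```python
from fractions import Fraction as F

SOURCE_BPP = F(8)                  # assumption: 8 bpp source samples
intraframe = SOURCE_BPP / F(3, 2)  # 8 bpp to 1 1/2 bpp, or 5 1/3 : 1
field_repeat = F(2)

# (conditional replenishment ratio, frame repeat ratio) for each column.
columns = [(F(4), F(1)), (F(2), F(2)), (F(4, 3), F(3)), (F(1), F(4))]

totals = [intraframe * cr * field_repeat * fr for cr, fr in columns]
frame_rates = [F(30) / fr for _, fr in columns]

change_rate = SOURCE_BPP / totals[0]          # rate spent on changes
total_rate = change_rate + F(1, 16) + F(1, 64)  # plus refresh and overhead

print(totals[0], change_rate, total_rate, [float(r) for r in frame_rates])
```

Every column gives the same total ratio because the frame repeat ratio rises
exactly as the conditional replenishment ratio falls.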

A block diagram of the simulated system is given in figure 1. All memories
are at 3 bpp, and use the organization of the 3 bpp list quantizations. The 3 bpp
quantization is obtained by rounding and truncating the transform coefficients,
without using read-only memories. The receiver memories often actually contain
lower quality descriptions of the subpictures, because of the use of lower rate
quantizations based on read-only memories for changed subpicture replacement. The
system can probably be implemented with a two frame delay, exclusive of repeats.
During the first frame time, the first field of the frame is transformed and truncated,
then change detected and flat/edge mode detected, and the change information placed
in the transmitter buffer. During the second frame time, the change and refresh
information is transmitted to the receiver buffer. During the third frame time,
the receiver buffer is used to update the receiver memory and to form the display,
in conjunction with the memory. In the repeat mode, the display is formed
entirely from memory, and the change information data is stored in the repeat
memory until the new frame is complete.

We further consider the optimized, fixed rate quantizations used to replace
changed subpictures. The previous system used a high quality, 3 bpp, quantization
for the transformed vector memory, as does the revised system. The quantization
is divided into twenty-four lists of vectors, each list at 1/8 bpp, for use in
variable rate change replacement, memory completion, and refresh. While the revised
system retains the 3 bpp quantization and lists for memory organization and refresh,
the changed subpictures are replaced using two-mode, optimized, fixed rate quantizers,
and there is no memory completion, other than that provided by the refresh.

Figure 1: Conditional replenishment system block diagram.

The purpose of two mode, optimized, fixed rate quantizers is to reduce the rate
required to replace a changed subpicture, while retaining the required quality.
Figure 2 shows the mean square error for various intraframe quantizers, for the
Reasoner image processed in the field. The previous system attempted to use at
least sixteen lists, or 2 bpp, to replace changed subpictures. The earlier and
improved fixed rate, single mode quantizers for 1 1/2 bpp have about the same
mean square error as the list quantizer at 2 bpp. By using fixed rate, single
mode quantizers, a gain of 1/2 bpp can be made. The penalty for using quantizers
which are not based on the memory quantization lists is that completing changed
subpictures becomes much more complicated. This memory completion can be dispensed
with, if the initial replacement quality is reasonably high, and if the refresh
rate is sufficient to supply full quality in a reasonable time.

A further reduction in the rate to replace a changed subpicture can be made
using two mode, variable rate intraframe compression. A variable rate for replaced
subpictures can easily be used, because a conditional replenishment system allocates
rate over a frame. Mean square results for three two-mode, variable rate methods
are also given in figure 2. Two mode, variable rate quantization provides
significant additional rate reduction. The data point at about 1.1 bpp uses a 1 1/2 bpp
edge mode and a 1 bpp flat mode. The points at about 1 1/4 bpp use a 2 bpp edge mode
with the same 1 bpp flat mode, and have two different change detection thresholds.
The average rate required to replace changed subpictures depends on the amount
of detail in moving or changing areas.

Optimising a two mode, variable rate system is an interesting problem. Assume
that we first design an acceptable quantization at some fixed rate. Because large,
noticeable errors usually occur at edges, a good subjective quantization is usually
optimised for edges. Many subpictures are flat and featureless, and could be
transmitted using fewer bits. A flat subpicture quantization would have narrow vector
ranges and small quantization steps. In a two mode system, the original acceptable
quantization is used for edges, and whenever a vector extends beyond the flat
quantization's range. The best two mode design gives the largest rate reduction from the
original single mode design. The rate reduction equals the fraction of subpictures
using the flat mode, times the difference between edge and flat mode rates. As the
rate of the flat mode is increased, its rate reduction is reduced, but its vector
ranges increase to cover a larger fraction of the subpictures. Because of these
compensating factors, the actual choice is not too critical. A 3/4 bpp flat design
was used for half the subpictures, for a rate reduction of 1/2(2 - 3/4) = 5/8. A
1 bpp flat design handled 3/4 of the subpictures, for a reduction of 3/4(2 - 1) = 3/4.
The 1 bpp flat design also had superior visual quality, and gave a more intuitively
acceptable measure of the portion of flat area in the Reasoner image.
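
The two designs compared above follow directly from the stated rule, rate
reduction = (fraction of subpictures in flat mode) x (edge rate - flat rate).
A small check, using a 2 bpp edge mode and with hypothetical function and
variable names:

```python
from fractions import Fraction as F


def rate_reduction(flat_fraction, edge_bpp, flat_bpp):
    """Average bpp saved relative to using the edge mode everywhere."""
    return flat_fraction * (edge_bpp - flat_bpp)


EDGE_BPP = F(2)

# 3/4 bpp flat design covering half the subpictures.
r_narrow = rate_reduction(F(1, 2), EDGE_BPP, F(3, 4))
# 1 bpp flat design covering three quarters of the subpictures.
r_wide = rate_reduction(F(3, 4), EDGE_BPP, F(1))

average_bpp = EDGE_BPP - r_wide  # resulting average replacement rate
print(r_narrow, r_wide, average_bpp)
```

The 1 bpp flat design gives an average of 1 1/4 bpp, which matches the two
mode operating points near 1 1/4 bpp mentioned for figure 2.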

The DCT and two new transforms for intraframe transform compression are described
in Appendix B. Figure 8 of Appendix B gives single mode results, also for the
Reasoner image, for the discrete cosine and 6 and C matrix transforms using the
experimentally based quantizers designed for the Hadamard transform. Some of the
results are compared in Table 2 below, for a mean-square error of 1.0.


TABLE 2

METHOD                                    RATE FOR MSE = 1.0

Hadamard, conditional rep lists           2.0
    "     single mode                     1.6
    "     improved single mode            1.5
Cosine, single mode, Hadamard design      1.4
Hadamard, two mode, fixed rate            1.4
    "        "      variable rate         1.15


The total range of transmission rates is 2.0:1.15 or 1.74:1. If the conditional
replenishment lists, which are now used only for memory update, are eliminated,
the range of rates is 1.6:1.15 or 1.39:1. The cosine gains 0.2 bits per pel, even
using a Hadamard quantizer design, and another 0.1 or 0.2 should be obtainable by
a design for the cosine.
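
The ranges quoted from Table 2 are simple ratios of the tabulated rates. A
quick check, with the table values entered by hand and illustrative key names:

```python
# Rates in bpp for a mean-square error of 1.0, from Table 2.
rates = {
    "hadamard_lists": 2.0,
    "hadamard_single": 1.6,
    "hadamard_improved_single": 1.5,
    "cosine_single": 1.4,
    "hadamard_two_mode_fixed": 1.4,
    "hadamard_two_mode_variable": 1.15,
}

best = rates["hadamard_two_mode_variable"]
range_with_lists = rates["hadamard_lists"] / best
range_without_lists = rates["hadamard_single"] / best
cosine_gain = rates["hadamard_single"] - rates["cosine_single"]

print(round(range_with_lists, 2), round(range_without_lists, 2),
      round(cosine_gain, 2))
```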

In the monochrome conditional replenishment test runs, the Hadamard transform
was used with 3 bpp memory quantization, 2 bpp edge mode, and 1 bpp flat mode
quantization. A tabulation of all the simulations is given in Table 3, Appendix D.
Conditional replenishment test runs have been made at 1/4, 1/8, and 1/16 bits
per sample, although only a few have been viewed on the Echo Science disc. 3/16 bits
per pel seems a reasonable rate, as outlined in table I above, for monochrome
conditional replenishment. The intraframe rate is taken as 1.5 bpp, for a flat (1 bpp)
and an edge (2 bpp) mode used equally. As the fraction changing increases beyond
1/4, frame repeat is introduced. The same table can be used to show how operation
at 1/16 bpp is possible. Using 5 1/3:1 for intraframe, 4:1 for conditional
replenishment, 2:1 for field repeat, a frame repeat of 4:1 is required. This gives 7 1/2
frames per second, noticeably degraded motion. The overall compression ratio is
120:1.

Color intraframe compression was examined as part of an earlier contract, and
a report was made in the winter of 1976-7. It was noted that the chrominance signals,
IQ or R-Y, B-Y, are usually filtered to 0.5 MHz, while the luminance or Y component
has a bandwidth of 4.5 MHz. This implies that the horizontal sampling rate can be
reduced 9:1, if suitable pre and post filtering is used. The report suggested that
one I or Q color sample be transmitted in each 4 by 4 sample block using 8 bits, for
an average color rate of 0.5 bpp. Published work in intraframe transform compression
using the I and Q signals indicated that the color components could be transmitted
at less than 1 bpp. Pratt used 0.75 bpp for both color components, and Chen used
0.53 bpp for I and 0.29 bpp for Q.
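
The chrominance figures above reduce to two small calculations. The sketch
reads the 0.5 bpp figure as 8 bits spread over the 16 samples of a 4 by 4
block, which is an interpretation of the text. The names are illustrative.

```python
LUMA_BANDWIDTH_MHZ = 4.5
CHROMA_BANDWIDTH_MHZ = 0.5

# Horizontal sampling rate reduction allowed by the narrower chroma band.
horizontal_reduction = LUMA_BANDWIDTH_MHZ / CHROMA_BANDWIDTH_MHZ

# One 8 bit color sample per 4 by 4 block of picture elements.
BITS_PER_COLOR_SAMPLE = 8
BLOCK_SAMPLES = 4 * 4
color_bpp = BITS_PER_COLOR_SAMPLE / BLOCK_SAMPLES

# Chen's published rates for I and Q, summed for comparison with 1 bpp.
chen_total_bpp = 0.53 + 0.29

print(horizontal_reduction, color_bpp, round(chen_total_bpp, 2))
```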

Experimental work in color compression at ARC was reported by Jones on Aug. 15,
1977. Hein, working in the frame, combined transformed but uncompressed Y (8 bpp)
with transformed, compressed I and Q. The I and Q signals had the Hadamard vectors
H00, H01, H10, and H11 each represented using 8 bits. Color performance was
acceptable. Jones repeated the above quantization in the field, and also used
2 bpp Y compression similar to that described in the monochrome intraframe section
above. Color performance was acceptable, but since an 8 by 8 block was used, the
total color rate was 1 bpp. The rate was reduced to 1/2 bpp using 8, 4, 4, and 0
bits for the I and Q vectors. Reduction to a total of 1/4 bpp was tried using
8, 0, 0, and 0 bits for I and Q, but the
smear of color at edges was unacceptable. The results of this work were gratifying,
because they provide a rare instance where experiment confirms theoretical prediction.
However, it should be noted that only a few images were tested, and only Hadamard,
single mode quantizations were used.

Investigation of color conditional replenishment was performed as part of
task II. D format tapes, in RGB, were converted to E, YIQ. Modifications were made
to the monochrome conditional replenishment so that color could be run. This
first color program made two important choices:

1) changes in time are detected using Y only, and I and Q are updated when Y
is updated.

2) the I and Q signals, like the Y, have two update modes, flat and edge.

If these choices are acceptable, the complexity of a color conditional replenishment
system is reduced. These choices are in accordance with Limb's idea (Plateau Coding)
that color usually changes only when luminance changes.

The two color files available are Wheel of Fortune (59 frames), and Water Skiers
(46 frames). These have been run at 1.0, 0.5, and 0.25 total average bpp, using
the intraframe bits given in table 3 below.

TABLE 3
RATE, bpp

            Memory     Flat      Edge

Y           3          1         2
I           1/4        1/8       1/4
Q           1/4        1/8       1/4

Total       3 1/2      1 1/4     2 1/2

The 1/4 bpp quantization for I and Q uses 8, 4, 4, and 0 bits for the four
vectors; the 1/8 bpp quantization uses 8, 0, 0, and 0 bits.

Wheel and Skiers have average total change percentages of 50% and 65% at 0.5 bpp,
where the threshold is high, and average change percentages of [illegible] and
[illegible] at 1 bpp, where the change threshold is low. These changes are the
highest seen so far, and are probably not typical of teleconference material. The
changes at 0.5 bpp are nearly all edge mode in Wheel, but are 40%-50% flat in
Skiers, so that the flat mode gives useful rate reduction. Table 3, above, shows
that the transmission of color changes requires a 25 percent higher rate than
monochrome changes.

Because of the adaptive change detection threshold, the number of changes
is a function of the fixed average transmission rate. However, for low rates
the threshold reaches the limiting value, and the minimum number of changes is
detected. The fixed rate is achieved by frame repeats, and the number of
changes fluctuates. This occurred for Skiers at the rate of 0.25 bpp. Table 4
gives the typical change rates for these files compressed at average rates of
1.0, 0.5, and 0.25 bpp.

Table 4: Typical change percentages

Average rate    1.0        0.5       0.25

Wheel           75 %       50 %      50 %
Skiers          9[?] %     65 %      65-75 %

These change rates are very large, and not typical of teleconference material,
but they provide an interesting worst case test.

At the highest rate, 1.0 bpp, the adaptive threshold drops to the minimum,
increasing the number of changes. At 0.5 and 0.25 bpp, the change detection
threshold increases to the maximum, least sensitive setting. The bit rates and
modes for the Y, I, and Q signals were indicated in the previous report. The
number of repeats and the resultant frame rates are given in table 5.

Table 5: Number of repeats / frames per second

Average rate    1.0      0.5      0.25

Wheel           0/30     1/15     2/10
Skiers          0/30     1/15     2-3/10-7.5

The number of repeats and the frame rates are determined by the average rate
allowed and the rate needed to describe the detected changes, as computed from the
number and mode of the changes. The number of repeats is much higher than for any
of the monochrome runs at the same average rate, largely because of the higher changes
in the two color test sequences.

The compressed test runs of Wheel and Skiers made at [illegible] and 0.25 bpp were
converted to RGB format and placed on the Echo Sciences disk. In general, the
results were good, and color artifacts due to two mode intraframe and conditional
replenishment compression were not detectable in real time.

Several different sorts of artifacts were found, and these are described below.

Color quality. The colors in general were pale. This is also true of the original,
and is due to the difficulty of synchronizing the Echo disk when saturated colors
are recorded. In Wheel, the girl's face turns slightly greenish when she moves
into the shadow of the wheel. This also occurs in the original. There is no
reason to suspect that the compression methods affect color quality.

Color modes. The flat and edge modes were selected using only the Y component.
This produced a blurring of the purple stripe around the door in Wheel, where
Y levels are similar. The refresh improved the definition rapidly at 0.5 bpp.
The edge/flat mode detection can also depend on I and Q.

Color change detection. In the Skiers sequence, parts of a red-orange reflection
are momentarily rendered as blue, the color of the object formerly occupying the
same image area. Both colors have similar Y level, and the change detection error
is not apparent in real time. The change detection can also depend on I and Q.

Moving edges. In Wheel, the more noticeably moving part of the girl's jeans has
a busy edge. This is a familiar conditional replenishment artifact for moving high
contrast edges.

Field repeat. In current simulations, all moving areas are shown using field
repeat. This causes a resolution loss most noticeable in the Skiers tow ropes,
which are jagged. This is typical of field repeat, and would be removed by the
refresh/update in static areas.

Motion artifacts. There is apparent flicker in the real-time presentation of the
conditional replenishment material. Some of this is due to the imbalance between
successive frames, which use different heads. This can be shown when the same frame
is rerecorded, as was done at 0.25 bpp. Some of the flicker is due to hum bars
recorded in random vertical positions, causing brightness changes at the frame
rate. Finally, there is some jerkiness due to the frame repeats, which are more
apparent at 0.25 bpp.

Other artifacts. In Skiers, some of the lettering on the ramp is strangely
interlaced, due to a film interlace problem in the original. Also in Skiers,
there are some white blocks which appear to anticipate the motion of the ramp,
probably due to the same cause.

In summary, the color artifacts observed in this material are not objectionable
in real time, and are either present in the original scene or can be removed by
testing I and Q for changes and modes.

COMPARISON OF VIDEO FRAMES AND FIELDS
FOR TRANSFORM COMPRESSION

INTRODUCTION


This paper describes the results achieved, and the hardware required, for
video compression using either fields or interlaced frames. Interlacing the
video fields, and the inverse operation, requires substantial digital memory,
but achieves a given compressed image quality using a lower transmission rate.

In television transmission, the scene is repeatedly scanned to form a field
image of about 256 lines. As shown in figure 1, each field consists of every
second line in the full frame, and the alternating fields (represented by solid
or dashed lines) are displaced vertically by one line. Two successive fields
form the full video frame of about 512 lines. Fields are transmitted at the
rate of sixty times per second, to avoid the objectionable flicker effect which
occurs at lower rates, even though thirty or fewer images per second are
sufficient for motion representation (1).

It is possible to transmit sampled images at reduced bit rates because much
of the information in samples taken at the Nyquist rate is redundant. The
successive samples are not independent, and the video process can be described by a first
order Markov model (2). This Markov model fits the measured correlation of the
four test images used here, as shown in figure 2 for the image of newscaster
Harry Reasoner, and in table I for all four test images. The image frames usually have
the highest correlation between adjacent samples in adjacent lines in a frame
(different fields), the next highest correlation between adjacent samples in the
same line, and the lowest correlation between corresponding samples in the
closest lines in a field (separated by an alternate field line). This is as expected,
from the 4 to 3 width to height aspect ratio of the video frame, and from the fact
that the four test frames have low motion or change between fields.
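
For a first order Markov model, the correlation between samples D positions
apart is R to the power D, where R is the adjacent-sample correlation. A
measured correlation C at distance D therefore implies R = C^(1/D), and a good
fit shows up as a nearly constant R over distance. The sketch below
illustrates this with synthetic values only. The function names and the
0.95 starting value are assumptions for illustration, not measured data.

```python
def markov_correlation(r, d):
    """Correlation at sample distance d under a first order Markov model."""
    return r ** d


def correlation_parameter(c, d):
    """Adjacent-sample correlation implied by correlation c at distance d."""
    return c ** (1.0 / d)


# Synthetic example: an ideal Markov source with R = 0.95.
R_TRUE = 0.95
distances = range(1, 8)
measured = [markov_correlation(R_TRUE, d) for d in distances]
implied = [correlation_parameter(c, d) for c, d in zip(measured, distances)]

# For an exact Markov source the implied R is the same at every distance.
spread = max(implied) - min(implied)
print(round(measured[1], 4), spread < 1e-12)
```

For real images the implied R varies over a small range with distance, and the
width of that range indicates how well the model fits.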

Because higher correlation allows more transmission rate compression, the
correlation values indicate that it is most effective to use samples in the
adjacent lines of alternate fields, next in effectiveness to use samples in the
current line, and least effective to use samples in the closest lines of the
same field. The cost of using these samples is the memory required to store them.
Using samples in the same line requires from a few samples to a line of samples to be
stored; using the closest lines in the same field requires several lines of
memory; and using samples in the adjacent lines of the alternate field requires
a full field of memory. Obtaining the lowest possible transmission rate can
require much more memory than less efficient systems. This effect is also
apparent in conditional replenishment systems, which use the correlation between
successive frames in time. (3) (4)

Figure 1: The interlaced video frame, with the first field
given by solid lines and the second field given by dashed lines.

Figure 2: Picture Element Correlation versus Distance for the Reasoner Image.

Table I: Range of the correlation parameter R for D equal to 1 through 7,
where R = C^(1/D), C is measured correlation and D is sample distance.

Picture              In-line       Between lines    Between lines
                                   in field         in frame

Reasoner             .966-.987     .967-.973        .984-.989
Two girls            .965-.982     .9?6-.950        .972-.978
Two men              .964-.977     .933-.9?6        .968-.972
Band                 .847-.882     [illegible]      .888-.916
Assumed in design    .95           .94              .97
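
The correlation parameter of table I can be estimated along the following lines. This is only a minimal sketch: it assumes the relation R = C^(1/D) implied by the first order Markov model, and it uses synthetic Markov rows in place of the test images, which are not available here. The function names are illustrative and do not come from the report.

```python
import numpy as np

def correlation_at_distance(image, distance, axis=1):
    """Normalized correlation C between samples `distance` apart along `axis`."""
    img = np.asarray(image, dtype=float)
    img = img - img.mean()
    n = img.shape[axis]
    a = np.take(img, np.arange(0, n - distance), axis=axis)
    b = np.take(img, np.arange(distance, n), axis=axis)
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def correlation_parameter(image, distance, axis=1):
    """R = C**(1/D): the adjacent-sample correlation implied by C at distance D."""
    c = correlation_at_distance(image, distance, axis)
    return c ** (1.0 / distance)

# Synthetic first order Markov rows stand in for a real image (assumption).
rng = np.random.default_rng(0)
r_true = 0.95
rows = np.zeros((64, 2048))
rows[:, 0] = rng.standard_normal(64)
for k in range(1, rows.shape[1]):
    noise = rng.standard_normal(64)
    rows[:, k] = r_true * rows[:, k - 1] + np.sqrt(1 - r_true**2) * noise

# In-line values for D = 1..7; axis=0 would give the between-line values.
r_by_distance = {d: correlation_parameter(rows, d) for d in range(1, 8)}
```

For data that fits the Markov model, R stays nearly constant as D varies, which is why a narrow range of R in table I indicates a good fit.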

TRANSFORM COMPRESSION SYSTEM 


Computer simulations of video image compression systems were undertaken
to compare the performance of field and frame compression. All experiments
involved single image compression of monochrome television images. Digitized
images were obtained by sampling a standard NTSC baseband signal at 8.064
megasamples per second. Each sample was represented by a six bit integer.

The visible area of the images has 416 samples per line and 464 lines per frame.
The nominal 512 samples and 512 lines include samples in the horizontal retrace
and lines in the vertical interval. The television images were compressed both
as fields of 232 lines and as interlaced frames of 464 lines.

The compression experiments used Hadamard transforms of eight by eight
subpictures. The coefficients of the sixty-four Hadamard vectors are used to
represent the subpictures. The eight by eight Hadamard basis vectors are shown
in sequency order in figure 3. Sequency is defined here as the total number of
white-black and black-white transitions, in the horizontal or vertical directions.
If the Hadamard transform is not normalized, the vector coefficients have a
possible range of twelve bits, since each is the result of 64 additions or
subtractions of six bit numbers. The vector coefficients were first rounded
to the eight most significant bits, and then quantized to an eight bit integer.
Transmission rate compression is achieved by using fewer than 64 quantizer
levels, and indicating which level is used with a code word shorter than six bits.
In the final compressed picture, each sample is represented using a six bit
integer, as in the original image.
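
The twelve bit range can be checked with a short sketch of the unnormalized eight by eight Hadamard transform. The Sylvester construction and the row sorting used here are one common way to obtain a sequency ordered matrix and are an assumption, not a description of the simulation code; the sorting is by one-dimensional sign changes per row.

```python
import numpy as np

def hadamard_sequency(n=8):
    """n x n Hadamard matrix (entries +1/-1) with rows sorted by sequency."""
    h = np.array([[1]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])       # Sylvester construction
    sign_changes = (np.diff(h, axis=1) != 0).sum(axis=1)
    return h[np.argsort(sign_changes)]

def hadamard_2d(block, h):
    """Unnormalized two-dimensional Hadamard transform of a square block."""
    return h @ block @ h.T

H8 = hadamard_sequency(8)
rng = np.random.default_rng(1)
block = rng.integers(0, 64, size=(8, 8))      # six bit samples, 0..63
coeffs = hadamard_2d(block, H8)               # 64 additions/subtractions each
restored = (H8.T @ coeffs @ H8) // 64         # inverse: divide by N*N = 64
```

The average coefficient can reach 64 x 63 = 4032, which is just under 2^12, so twelve bits cover the magnitude of every coefficient.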

Figure 4 shows the hardware organization of the independent field transform
compressor. The input lines are converted to digital samples, stored in an eight
line memory, transformed in eight sample by eight line blocks, and quantized.
The quantized bit stream is transmitted, and the inverse process is used to
generate analog video. The field compressor uses the correlation between the
samples in a line and between the lines in a field. Some correlations are not
used, since each eight by eight subpicture is processed independently.

Figure 5 shows the interlaced frame transform compressor, which performs the
same functions as the field processor. It differs because eight by eight
subpictures are taken from an interlaced frame, rather than one field. The eight by
eight subpictures have one-half the height of field subpictures. In order to
interlace a frame, the first field is held in memory until the second field is
being generated. The fields are then interlaced and the subpictures are transformed.
Information on the two fields is partly transmitted and partly stored during the
second field time, and the stored information is transmitted during the next field
time. The receiver output display is not synchronized to the data transmission,
as it was in the field compressor. To provide the correct display, the receiver
requires a compressed memory to hold the frame in the transmitted form. This
memory is decoded twice, to provide the two fields. The memory required is one
field at eight bits and one-half of a compressed frame (assumed to require 1 bit
per sample) at the encoder, and one compressed frame at the decoder (assumed to
require 2 bits per sample), or the equivalent of one frame at 7 bits (i.e., 7 bit
frames). This is the cost of using the correlation between adjacent lines in a
frame, rather than the correlation between the closest lines in a field.
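
The difference between field and frame subpictures can be illustrated with a small sketch that interlaces two fields and cuts eight by eight blocks. The array sizes follow the 416 sample, 232 line fields described above; the choice of which field supplies the top line, and the function names, are assumptions made for illustration.

```python
import numpy as np

def interlace(first_field, second_field):
    """Interleave two fields (lines x samples) into one frame."""
    lines, samples = first_field.shape
    frame = np.empty((2 * lines, samples), dtype=first_field.dtype)
    frame[0::2] = first_field      # solid lines of figure 1
    frame[1::2] = second_field     # dashed lines of figure 1
    return frame

def subpictures(image, size=8):
    """Split an image into non-overlapping size x size blocks."""
    lines, samples = image.shape
    blocks = image.reshape(lines // size, size, samples // size, size)
    return blocks.swapaxes(1, 2)   # (block row, block column, line, sample)

field1 = np.arange(232 * 416).reshape(232, 416)
field2 = field1 + 1
frame = interlace(field1, field2)

field_blocks = subpictures(field1)   # each block spans 8 lines of one field
frame_blocks = subpictures(frame)    # each block spans 4 lines of each field
```

A frame block covers eight adjacent frame lines, four from each field, so it has one-half the picture height of a field block while using the higher correlation between adjacent frame lines.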

SYSTEM SIMULATION RESULTS


Figure 6 shows the mean-square error results obtained (in units of the least
significant of the six original bits) when the Harry Reasoner test image was
compressed using theoretical compression designs. The different compression
designs consist of the bit assignments and quantizers for each of the 64 Hadamard
vector coefficients. The theoretical designs assumed the first order Markov
correlation model (with the "assumed in design" values of table I), an exponential
distribution for the transform vector coefficients, and the mean-square error
measure. At the same rates, field compression produces larger error than frame
compression, or, equivalently, field compression requires more rate for a given
error. However, there are two cases in the field data, and one case in the frame
data, where 1/2 or 1 bit per sample increases in the transmission rate produce
little or no reduction in error. The theoretical designs obviously do not
make the best possible use of the transmission rate.

Figure 6: Rate versus mean-square error for the Harry Reasoner image
compressed using theoretical compression designs.

Figure 7 shows the mean-square error obtained when the Harry Reasoner
test image was compressed using experimental compression designs. The curves
are smooth, and added rate always reduces error. The experimental designs
give much lower error than the theoretical designs. The field compression
curve for the experimental designs (figure 7) is nearly identical to the frame
curve for the theoretical designs (figure 6), from 4 bits per sample down to
one bit per sample. The experimental designs used are similar to designs
obtained by trial and error, but were generated using a formalized procedure
based on the requirement of good representation for both the edges and the
low detail areas in video images. For a full discussion of the theoretical and
experimental designs used, see reference 5.

Figure 7 shows that, over most of the range of transmission rates, field
compression requires a transmission rate about fifty percent greater than frame
compression, for the same mean-square error. At the highest rate shown, 4 bits
per sample, the mean-square error is caused by rounding all the transform vectors
to eight bits, and all methods give about the same error.

Figure 8 shows the mean-square error obtained using the experimental
compression designs on all four test images, in both frame and field compression.
Because of the wide range in the detail and correlation of the test images,
the mean-square error at each transmission rate ranges over an order of magnitude,
and the mean-square error is plotted on a log scale. Even though the test
images differ greatly, the parallel curves of figure 8 show that the increased
rate required by field compression is nearly constant, at about 1/2 bit per
sample, for these images in the range of transmission rates between 1/2 and 2
bits per sample. It seems that the frame or field compression trade-off can
be summarized as 7 bit frames of memory for 1/2 bit per sample in transmission
rate.

The subjective impressions of the compressed images agree in quality ranking
with the mean-square error results. Figure 9 shows the original image of Harry
Reasoner. Figure 10 shows this image compressed using 1 bit per sample in the
frame, and figure 11 shows it compressed using 2 bits per sample in the field.
The compressed images exhibit edge degradation, especially at the shoulders,
lips, collar, and tie. The field image at 2 bits per sample has somewhat
higher quality than the frame image at 1 bit per sample, as indicated by the
error values of figures 7 and 8. (An original of the band image is shown
in reference 6.)

Figure 7: Rate versus mean-square error for the Harry Reasoner image
compressed using experimental compression designs.

Figure 8: Rate versus mean-square error for the four test images
compressed using experimental compression designs.

Figure 9: The original Reasoner image.

Figure 10: The Reasoner image processed as an interlaced
frame at 1 bpp, using the experimental design.

TIME EFFECTS 


The above comparison of frame and field video compression considered
only the quality of the individual images, and ignored the effects of motion
and the time sequence of images. The two fields in a frame are generated
one-sixtieth of a second apart in time, and motion tends to make the
correlation between adjacent lines in a frame lower than the correlation between the
closest lines in the same field. A fifth test image, of a blurred hand moving
rapidly over a writing pad, was compressed in the same way as the four other
test images. Because of the motion, the mean-square error was lower for field
compression. (An original of the pad image is shown in reference 6.) Transform
compression, especially at lower transmission rates, tends to average adjacent
samples and lines. Two fields processed as a frame become similar, and high
motion areas where the original fields differ become blurred. In frame
compression of high motion scenes at low transmission rates, the motion update
rate is the frame rate, thirty per second, rather than the field rate, sixty
per second. Because the frame rate is adequate for representing motion, this
is not an impairment.

These observations suggest a third approach to video compression. Since
frame processing tends to average the two fields at lower transmission rates
(which would reduce the vertical resolution and the motion update rate to
one-half their original values), frame compression is more similar to field repeat
compression than to independent two field processing. In field repeat compression,
only one-half the fields are transformed and transmitted, and each transmitted
field is displayed twice at the decoder. Figure 12 shows the block diagram of
a field repeat compression system. As the field to be transmitted is generated,
half the compressed information is transmitted and half is stored. The full
compressed field is retained in the decoder, for repeated display. The total
memory requirement for field repeat compression is 1 1/2 bit frames, rather
than 7 bit frames for frame compression. A real time hardware system using
the Hadamard transform and field repeat has been previously described. (6)

Figure 11: The Reasoner image processed as two fields at 2 bpp
using the experimental design.
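
The two memory totals can be reproduced with simple bookkeeping in units of bit frames (one bit for every sample of a frame). The sketch below assumes 2 bits per sample for every compressed memory, which is the reading under which the stated totals of 7 and 1 1/2 bit frames both come out exactly; the names are illustrative.

```python
from fractions import Fraction

FRAME = Fraction(1)       # one frame of samples
FIELD = Fraction(1, 2)    # one field holds half the samples of a frame

def bit_frames(parts):
    """Sum (fraction of a frame) x (bits per sample) over all memories."""
    return sum(size * bits for size, bits in parts)

# Frame compression: one uncompressed field at 8 bits, half a compressed
# frame at the encoder, and a full compressed frame at the decoder.
frame_memory = bit_frames([(FIELD, 8), (FRAME / 2, 2), (FRAME, 2)])

# Field repeat: half a compressed field at the encoder and a full
# compressed field at the decoder.
field_repeat_memory = bit_frames([(FIELD / 2, 2), (FIELD, 2)])
```

Under this assumption the uncompressed field held at the encoder accounts for 4 of the 7 bit frames, which is the part that field repeat avoids entirely.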

Since a single field has only one-half of the samples in a frame, the
same overall transmission rate is obtained when the number of bits per
transmitted sample is doubled for field repeat. The overall rate is the number of
bits per image multiplied by the number of images per second. Field repeat
compression transmits only one-half of the field images used in frame or field
compression, as discussed above. The previous mean-square error results also
indicate the performance of field repeat, since the same error in each transmitted
field is obtained if one or two independent fields are transmitted. Field repeat
compression at 2 bits per sample has the same error as field compression at 2 bits
per sample, but the overall transmission rate for field repeat corresponds to
that for frame or field compression at 1 bit per sample. Field repeat compression
can be compared to frame or field compression in figures 6, 7, and 8 by moving
each point on the field compression curve to a point having the same mean-square
error and one-half the transmission rate. This shows that the error is slightly
lower for field repeat compression than for frame compression, and much lower than
for field compression.
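
The curve shift described above amounts to halving the rate coordinate of every field compression point. A minimal sketch follows; the sample points are placeholders chosen for illustration and are not read from figures 6 to 8.

```python
def field_repeat_curve(field_curve):
    """Map (bits per sample, mean-square error) points for field compression
    to field repeat: same error, one-half the overall transmission rate."""
    return [(rate / 2.0, mse) for rate, mse in field_curve]

# Placeholder points, NOT measured values from the report.
hypothetical_field_curve = [(4.0, 1.0), (2.0, 4.0), (1.0, 12.0)]
hypothetical_repeat_curve = field_repeat_curve(hypothetical_field_curve)
```

This mapping reflects only the error within each transmitted field; it does not capture the loss of vertical resolution that comes from displaying one field twice.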

Figure 13 shows a field repeat image of the first field of figure 11. This
field repeat image requires the same overall transmission rate as the frame
processed image in figure 10, having one field at 2 bits per sample rather than
one frame at 1 bit per sample, and the subjective quality is similar. The field
repeat image has lower quality than the field compressed image, but that image
has two independent fields at 2 bits per sample, and requires twice the overall
transmission rate. In field repeat, vertical resolution is noticeably reduced
in detailed areas, and quantization noise and contouring are more apparent in
background areas. It should be reemphasized that field repeat compression is
appropriate at the lower transmission rates, where it is not possible to provide
the full potential resolution of uncompressed video.

Figure 13: The Reasoner image using field repeat. The first
field is compressed using 2 bpp as in figure 11, and the second
field is supplied by repeating the first field.

CONCLUSION


Experimental simulations of interlaced frame and independent field
compression systems indicate that frame compression can achieve a transmission
rate about one-half bit per sample lower than field processing at a given
image quality, with the added requirement of 7 bit frames of memory. Frame
transform compression can be used at lower transmission rates than field
compression, but replaces the two fields in the frame with two similar combinations
of the original fields. If it is decided to use only one field in field
repeat compression, performance similar to frame compression at low transmission
rates can be obtained using only 1 1/2 bit frames of memory. A conditional
replenishment compressor, which uses the correlation between successive frames,
can be implemented using 7 bit frames of memory. (4)
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APPENDIX B 


THE K-L, DCT, AND RELATED TRANSFORMS

OBTAINED VIA THE HADAMARD TRANSFORM



I. INTRODUCTION


It is well known that the Karhunen-Loeve or eigenvector transformation
provides the maximum possible data compression, and also that the discrete
cosine transform is a close approximation to the Karhunen-Loeve transformation,
for highly correlated data fitting the first-order Markov model. For less
correlated data, the discrete cosine transform provides data compression nearly
equal to that of the Karhunen-Loeve transform, even though the transforms differ.
The Hadamard transform provides most of the potential data compression, but it
always provides less data compression than the discrete cosine transform for the
first-order Markov model. There is a general class of transforms, which are
computed via the Hadamard transform, that can be designed to approach the
performance of the Karhunen-Loeve transform, and also to meet various restrictions
which simplify hardware implementation.

II. THE KARHUNEN-LOEVE TRANSFORM

In the proof that the Karhunen-Loeve transform (KLT) is the optimum
orthonormal transform with respect to the mean-square distortion measure, it
is assumed that distortion is introduced by neglecting some of the lower energy
transform coefficients. The KLT is optimum because the distortion, which is
the total energy of the neglected coefficients, is minimum for any number of
neglected coefficients. If the transmitted coefficients are described with
finite accuracy, the optimum coefficient bit assignment is well known
(reference 2).

$$ b_i = \begin{cases} \tfrac{1}{2}\log_2\left(\sigma_i^2/d\right), & \sigma_i^2 > d \\ 0, & \sigma_i^2 \le d \end{cases} \tag{1} $$

$\sigma_i^2$ is the energy of the ith transform vector coefficient, and $b_i$ is the number
of bits used to transmit the ith coefficient. $b_i$ is zero when the coefficient
energy is less than the distortion, d, allocated for each vector. If M of the N
vectors are transmitted, the distortion is

$$ D = M d + \sum_{i=M+1}^{N} \sigma_i^2 \tag{2} $$

The total bit rate for M transmitted vectors is

$$ B = \sum_{i=1}^{M} \tfrac{1}{2}\log_2\left(\sigma_i^2/d\right) = \tfrac{1}{2}\sum_{i=1}^{M}\log_2\sigma_i^2 - \tfrac{M}{2}\log_2 d \tag{3} $$

Under the constraint that the energy of the transmitted vectors is fixed,
or that the distortion is fixed, the total rate, B, is minimum for the KLT. Suppose
that the transform vectors are ranked in order of decreasing energy, and that a
suboptimal transform is used. For some i less than j, the energy, e, is transferred
from $\sigma_i^2$ to $\sigma_j^2$. From equation 3, B is reduced by the suboptimal transform
only if

$$ (\sigma_i^2 - e)(\sigma_j^2 + e) < \sigma_i^2 \sigma_j^2 $$

$$ e\,(\sigma_i^2 - \sigma_j^2) - e^2 < 0 $$

$$ \sigma_i^2 < \sigma_j^2 + e $$

Since this implies that the suboptimal transform has more energy compaction than
the KLT, which is impossible, the rate for fixed distortion is minimized by the KLT.
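
Equations 1 to 3 can be exercised with a few lines of code. This is a sketch only: the vector energies and the distortion d below are hypothetical values chosen so that the logarithms come out exactly, and the function names are not from the report.

```python
import math

def bit_assignment(energies, d):
    """Equation 1: 0.5*log2(energy/d) bits if energy > d, otherwise 0."""
    return [0.5 * math.log2(e / d) if e > d else 0.0 for e in energies]

def distortion(energies, d):
    """Equation 2: d for each transmitted vector plus the neglected energy."""
    return sum(d if e > d else e for e in energies)

def total_rate(energies, d):
    """Equation 3: the sum of the bit assignments of the transmitted vectors."""
    return sum(bit_assignment(energies, d))

# Hypothetical vector energies, ranked in decreasing order (illustration only).
energies = [16.0, 4.0, 1.0, 0.25]
d = 0.5
bits = bit_assignment(energies, d)   # three vectors transmitted, one neglected
D = distortion(energies, d)          # 3 * 0.5 + 0.25
B = total_rate(energies, d)
```

With these numbers M = 3, and the expanded form of equation 3 gives the same rate: one-half of (4 + 2 + 0) minus (3/2) log2(0.5), or 3 + 1.5 = 4.5 bits.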

If $\Sigma_x$ is the correlation matrix of the data sample vector, and K is the
matrix of the KLT, the correlation matrix of the KLT vectors is

$$ \Sigma_{KL} = K\,\Sigma_x\,K^T = \Lambda $$

$\Lambda$ is the diagonal matrix $(\sigma_1^2, \sigma_2^2, \ldots, \sigma_N^2)$, where the vector energies
are the eigenvalues of $\Sigma_x$.
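
The diagonalization can be checked numerically. The sketch below assumes a first-order Markov correlation matrix with r = 0.95 and N = 8 as the example input, and uses NumPy's symmetric eigensolver; the function names are illustrative.

```python
import numpy as np

def markov_correlation(n, r):
    """Correlation matrix with elements r**|i-j| (first-order Markov model)."""
    idx = np.arange(n)
    return r ** np.abs(idx[:, None] - idx[None, :])

def klt(corr):
    """Rows of K are the eigenvectors of corr, in order of decreasing energy."""
    energies, vectors = np.linalg.eigh(corr)
    order = np.argsort(energies)[::-1]
    return vectors[:, order].T, energies[order]

corr = markov_correlation(8, 0.95)
K, energies = klt(corr)
corr_kl = K @ corr @ K.T      # diagonal, with the vector energies on the diagonal
```

Because K is orthonormal, the vector energies sum to the trace of the correlation matrix, which is N for unit sample variance.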
III. THE KARHUNEN-LOEVE TRANSFORM OBTAINED VIA THE HADAMARD TRANSFORM

The eigenvectors, the rows of K, are even and odd vectors which can be obtained
independently from the odd and even Hadamard vectors. The general implementation
of the KLT requires $N^2$ multiplications. These can be replaced by a Hadamard
transform followed by $N^2/2$ multiplications.

The elements of $\Sigma_x$ are the sample correlations, $c(x_i, x_j)$, where i and
j, $1 \le i, j \le N$, indicate the sample locations. Suppose the correlation is
stationary (reference 3, p. 109), $c(x_i, x_j) = c(|i-j|)$. $\Sigma_x$ has the form

$$ \Sigma_x = \begin{bmatrix} c(0) & c(1) & c(2) & \cdots & c(N-1) \\ c(1) & c(0) & c(1) & \cdots & c(N-2) \\ \vdots & & & & \vdots \\ c(N-1) & c(N-2) & c(N-3) & \cdots & c(0) \end{bmatrix} = \begin{bmatrix} cr1 \\ cr2 \\ \vdots \\ crN \end{bmatrix} $$

Row cr1 is the transpose of row crN (the same elements in reverse order), and
row cri is the transpose of row cr(N+1-i).

The Walsh-Hadamard transform (WHT) is defined for N a power of 2. The WHT is
orthonormal. The rows and columns of the WHT matrix, H, are
the transform vectors, and have sample weighting values of +1 and -1. The WHT
vectors are even and odd. $H^T$ is written as columns of orthonormal transform vectors

$$ H^T = \begin{bmatrix} hc1 \mid hc2 \mid \cdots \mid hcN \end{bmatrix} $$

We assume that H is in bit reverse order so that the first N/2 columns are even
vectors, and the second N/2 are odd vectors. The separation of vectors into even
and odd groups occurs in the first operation of the bit reverse WHT (p. 106 of
the reference cited for the WHT).

We compute the WHT correlation matrix, to show that the even and odd WHT
vectors are uncorrelated.

$$ \Sigma_x H^T = \begin{bmatrix} cr1 \\ \vdots \\ crN \end{bmatrix} \begin{bmatrix} hc1 \mid hc2 \mid \cdots \mid hcN \end{bmatrix} = \begin{bmatrix} y_{11} & y_{21} & \cdots & y_{N1} \\ y_{12} & y_{22} & \cdots & y_{N2} \\ \vdots & & & \vdots \\ y_{1N} & y_{2N} & \cdots & y_{NN} \end{bmatrix} = Y, \qquad y_{ji} = cri \; hcj $$

Because cri is the transpose of cr(N+1-i), $y_{ji} = cri \; hcj = cr(N+1-i) \; hcj = y_{j,N+1-i}$
whenever hcj is even, and $y_{ji} = cri \; hcj = -cr(N+1-i) \; hcj = -y_{j,N+1-i}$ whenever
hcj is odd. Writing Y in terms of columns

$$ Y = \begin{bmatrix} yc1 \mid yc2 \mid \cdots \mid ycN \end{bmatrix} $$

yci is even if hci is even; yci is odd if hci is odd. Since the rows of H are the
same as the columns of $H^T$, and since even and odd vectors are orthogonal, it follows
that half the elements of $\Sigma_H$ are zero.

$$ \Sigma_H = H\,\Sigma_x\,H^T = H\,Y = \begin{bmatrix} \Sigma_{H1} & 0 \\ 0 & \Sigma_{H2} \end{bmatrix} $$

The even WHT vectors are uncorrelated with the odd vectors, and the
completely uncorrelated KLT vectors can be obtained by operating independently
on the even and odd WHT vector sets.

$$ \Sigma_{KL} = A\,\Sigma_H\,A^T, \qquad A = \begin{bmatrix} A_1 & 0 \\ 0 & A_2 \end{bmatrix} \tag{4} $$

Since a linear combination of even/odd vectors is even/odd, the KLT vectors are
even and odd. The KLT can be obtained, following the WHT, by $2(N/2)^2 = N^2/2$
multiplications. This matrix factorization is the basis of implementation of
approximations to the KLT described below.
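
The even/odd decoupling and the resulting factorization can be verified numerically. The sketch below is an illustration under stated assumptions: it uses a first-order Markov correlation matrix as the stationary example, a Sylvester Hadamard matrix, and a simple reordering that puts the even vectors first in place of the bit reverse ordering; the block matrix named A here is built from eigenvectors and is only meant to show that two N/2 by N/2 operations suffice.

```python
import numpy as np

def hadamard(n):
    """Orthonormal Sylvester Hadamard matrix."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)

def even_first(h):
    """Reorder rows so the even (mirror-symmetric) vectors come first."""
    is_even = np.all(np.isclose(h, h[:, ::-1]), axis=1)
    return np.vstack([h[is_even], h[~is_even]])

n, r = 8, 0.95
idx = np.arange(n)
corr = r ** np.abs(idx[:, None] - idx[None, :])   # stationary correlation
H = even_first(hadamard(n))
corr_h = H @ corr @ H.T                           # WHT correlation matrix

half = n // 2
cross = corr_h[:half, half:]                      # even/odd block, expected zero

# Diagonalize each N/2 x N/2 block separately, then stack: K = A H.
_, a1 = np.linalg.eigh(corr_h[:half, :half])
_, a2 = np.linalg.eigh(corr_h[half:, half:])
A = np.zeros((n, n))
A[:half, :half] = a1.T
A[half:, half:] = a2.T
K = A @ H
corr_kl = K @ corr @ K.T                          # expected diagonal
```

The matrix A has at most 2(N/2)^2 = 32 nonzero elements for N = 8, which is the multiplication count given in the text.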
IV. THE DISCRETE COSINE TRANSFORM OBTAINED VIA THE HADAMARD TRANSFORM

The discrete cosine transform (DCT) also has even and odd vectors, and
Hein and Ahmed have shown how the DCT vectors can be obtained by a sparse
matrix multiplication on the WHT vectors (reference 5). Since the DCT, unlike the
general KLT, has a constant vector and a shifted square wave vector in common with
the WHT, the number of matrix multiplications is fewer than $N^2/2$. The A matrix
(equation 4) which generates the DCT vectors for N = 8 from the WHT vectors
is given by Hein and Ahmed, and is reproduced here as figure 1. While this
implementation of the DCT requires more operations for large N than the most
efficient DCT implementations (reference 6), it is very satisfactory for smaller N.

If a transform has even and odd vectors and has a constant vector, as is
typical, it can be obtained via the WHT in the same way as the DCT. The slant
transform is an example. A hardware implementation of the DCT via the WHT
is being constructed at NASA-Ames Research Center, using N = 8 and the matrix of
figure 1. Since this implementation contains the matrix multiplication factors
in inexpensive read-only memories, it will be possible to consider the real-time
quantization design and evaluation of a large class of transforms. Transforms
with suboptimum performance are acceptable only if they can be implemented with
reduced complexity. Transform performance can be determined theoretically from
the vector energy compaction, while the implementation complexity can be estimated
from the number and type of operations added after the WHT.

$$ A = \begin{bmatrix}
1.0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1.0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0.923 & 0.383 & 0 & 0 & 0 & 0 \\
0 & 0 & -0.383 & 0.923 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0.907 & -0.075 & 0.375 & 0.180 \\
0 & 0 & 0 & 0 & 0.214 & 0.768 & -0.513 & 0.318 \\
0 & 0 & 0 & 0 & -0.318 & 0.513 & 0.768 & 0.214 \\
0 & 0 & 0 & 0 & -0.180 & -0.375 & -0.075 & 0.907
\end{bmatrix} $$

Figure 1: The A matrix used to obtain the DCT from the WHT.
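
The sparse structure of figure 1 can be reproduced by multiplying the DCT matrix by the transpose of the WHT matrix, with both sets of vectors placed in bit reverse sequency order. The sketch assumes the orthonormal DCT-II and a sequency ordered Sylvester Hadamard matrix whose rows start with +1; these sign and ordering conventions are assumptions, and other conventions would change some signs.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II; row k has sequency k."""
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0] /= np.sqrt(2.0)
    return c

def wht_sequency(n):
    """Orthonormal Walsh-Hadamard matrix; row k has sequency k."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    changes = (np.diff(h, axis=1) != 0).sum(axis=1)
    return h[np.argsort(changes)] / np.sqrt(n)

n = 8
order = [0, 4, 2, 6, 1, 5, 3, 7]       # bit reverse sequency order
dct = dct_matrix(n)[order]
wht = wht_sequency(n)[order]
A = dct @ wht.T                         # so that DCT = A @ WHT
```

The product has two unit entries, one two by two block, and one four by four block, or twenty non-trivial multiplications, and its entries agree with figure 1 to about three decimal places.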
V. COMPARISON OF TRANSFORMS USING THE FIRST ORDER MARKOV CORRELATION MODEL

It is generally accepted that the sample-to-sample correlation of an image
line scan is approximated by the first-order Markov model (reference 8).

$$ c(x_i, x_j) = c(|i-j|) = r^{|i-j|} \tag{5} $$

r is the correlation of adjacent samples, and varies from 0.99 for low detail
images to 0.80 for high detail images, with an average of about 0.95 (reference 9).
The correlation matrix, $\Sigma_x$, was generated using the first-order Markov model, for
various r, and the corresponding KLT's and vector energies were numerically
computed (the analytic solution is known; see reference 10). In addition, the
matrix $\Sigma_x$ was used to compute the transform vector energies and correlations
for the WHT, DCT, and other transforms, using equation 4 above.

As is well known, the KLT vectors for r = 0.95 are very similar to the DCT
vectors, and have nearly identical vector energies (references 1 and 4). The most
apparent difference between the DCT and the KLT is that the KLT vector corresponding
to the constant DCT vector is not exactly constant, but weights the central samples
in a fixed transform block more than samples near the edge of the block. As r
approaches 1.00, this KLT vector approaches the constant vector, and all the KLT
vectors approach the corresponding DCT vectors. As r becomes less than 1.00,
the higher weighting of the central samples increases. The weights of the first
and fourth samples, for N = 8, are given in table I. As the sample correlation
decreases, the central samples provide a better estimate of the average sample
value than do the extreme samples. The vector energies of the KLT and the DCT
are nearly identical for r greater than 0.90, and differ only slightly for r
greater than 0.50. The KLT and DCT vector energies for N = 8 and r = 0.50 are
plotted in figure 2. The energy compaction at r = 0.5 is much less than at the
typical r = 0.95.

Table I. The weights of the first and fourth samples in the KLT,
for N = 8, and various correlations.

Correlation    Weight of the    Weight of the
r              first sample     fourth sample

0.999          0.353            0.354
0.99           0.350            0.356
0.95           0.338            0.364
0.9            0.324            0.374
0.8            0.296            0.392
0.7            0.272            0.407
0.6            0.250            0.420
0.5            0.230            0.430

The rate-distortion performance of a transform depends on the transform energy
compaction, as shown in equations 2 and 3. For small distortion, d is less than
$\sigma_i^2$ for all i, and all N transform vectors are quantized and transmitted. The
number of bits required is given by equation 3 with M = N.

$$ B = \tfrac{1}{2}\sum_{i=1}^{N}\log_2\sigma_i^2 - \tfrac{N}{2}\log_2 d $$

The first term of B can be used as a figure of merit for a transform.

$$ F = \tfrac{1}{2}\sum_{i=1}^{N}\log_2\sigma_i^2 $$

F is a negative number; the larger its magnitude, the greater the rate reduction
achieved by the transform. Table II gives the rate reduction factor, F, for the
KLT, DCT, WHT, and two similar transforms that will be described below. At
correlation r = 0.95, the KLT gains 0.014 bits more than the DCT, and 1.183 bits
more than the WHT. The WHT achieves most of the available data compression, and
the DCT achieves nearly all. As this rate reduction is obtained for all N vectors,
the increased compression of the DCT over the WHT, for r = 0.95, is 1.169/8, or
0.15 bits per sample.

Table II. The figure of merit, $F = \tfrac{1}{2}\sum_{i=1}^{N}\log_2\sigma_i^2$, for different
transforms at N = 8 and various correlations.

Correlation, r    KLT            DCT        WHT        B matrix    C matrix

0.99              -19.817        -19.775    -18.489    -19.205     -19.597
0.95              -11.743        -11.729    -10.560    -11.206     -11.558
0.90               -8.379         -8.341     -7.311     -7.875      -8.180
0.80              [illegible]     -5.092     -4.317     -4.731      -4.954
0.70               -3.402         -3.328     -2.765     -3.056      -3.214
0.50               -1.453         -1.396     -1.136     -1.261      -1.333
0.00                0.00
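
The KLT weights of table I and the figure of merit of table II can be recomputed from the Markov model. The following is a sketch under stated assumptions: unit sample variance, orthonormal transforms, F taken as one-half the sum of the base 2 logarithms of the vector energies, and NumPy's eigensolver in place of whatever numerical method was originally used. Small differences from the printed tables are therefore possible.

```python
import numpy as np

def markov_correlation(n, r):
    idx = np.arange(n)
    return r ** np.abs(idx[:, None] - idx[None, :])

def klt_matrix(corr):
    """Rows are eigenvectors of corr in order of decreasing energy."""
    _, vectors = np.linalg.eigh(corr)
    return vectors[:, ::-1].T

def dct_matrix(n):
    k = np.arange(n)[:, None]
    m = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos((2 * m + 1) * k * np.pi / (2 * n))
    c[0] /= np.sqrt(2.0)
    return c

def wht_matrix(n):
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    return h / np.sqrt(n)

def figure_of_merit(transform, corr):
    """F = 0.5 * sum(log2(vector energies)) for an orthonormal transform."""
    energies = np.diag(transform @ corr @ transform.T)
    return 0.5 * float(np.log2(energies).sum())

n, r = 8, 0.95
corr = markov_correlation(n, r)
K = klt_matrix(corr)
first_vector = np.abs(K[0])                  # sign of an eigenvector is arbitrary
w_first, w_fourth = first_vector[0], first_vector[3]

F = {name: figure_of_merit(t, corr)
     for name, t in [("KLT", K), ("DCT", dct_matrix(n)), ("WHT", wht_matrix(n))]}
```

For the KLT the figure of merit equals one-half the base 2 logarithm of the determinant of the correlation matrix, which for the Markov model is (N-1)/2 times log2(1 - r^2); this gives an independent check on the KLT column.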
VI. OTHER TRANSFORMS OBTAINED VIA THE HADAMARD TRANSFORM

If a transform with good performance and simpler implementation than the DCT
is required, approximations to the DCT obtained via the WHT can be considered.
The matrix multiplication of the WHT vectors which produces the DCT for N = 8,
first given by Hein and Ahmed, is shown in figure 1 above. The sequency of a
transform vector is defined as the number of sign changes in the vector. The
vector sequencies of the vectors corresponding to the matrix of figure 1 are in
bit reverse order, as indicated (0, 4, 2, 6, 1, 5, 3, 7). The energy compaction of
the WHT and DCT for r = 0.95 and N = 8 is shown in figure 3. In the conversion
from WHT to DCT, the two by two matrix operation on vectors 2 and 6 transfers
energy from 6 to 2. The four by four matrix operation on the vectors of sequency
1, 5, 3, and 7 reduces the energy of 3, 5, and 7 and increases the energy of 1.
These operations remove most of the residual correlation of the WHT vectors. The
matrix multiplication requires twenty multiplications, by ten different factors
(fifteen factors including sign differences).

We first consider a simplified operation on the 2 and 6 and the 1 and 3
sequency vectors. This operation consists of multiplying the WHT vectors by
matrix B, given in figure 4. This further transform is designed to reduce
correlation and to generate new transform vectors in a way somewhat similar to
the A matrix multiplication which produces the DCT. There are two identical
two by two operations, and a total of eight multiplications by two different
factors (three including sign). The energy compaction of the B matrix transform
is shown in figure 3, with the energies of the WHT and DCT. As the B matrix
transform vectors of sequency 0, 4, 5, and 7 are identical to the WHT vectors,
they have identical energy. The B matrix transform vectors of sequencies 0, 1,
2, 3, 4, and 6 are identical to the corresponding DCT vectors (0, 4) or very
similar. For example, the B matrix vector of sequency 1 is a slanted vector
of step width 2 and step size 2 (3, 3, 1, 1, -1, -1, -3, -3). The performance of the
B matrix transform, in terms of the figure of merit, is given in table II above.
The B matrix transform has something more than one-half of the gain of the DCT
over the WHT, with something less than one-half the multiplications, and less than
one-fourth the hardware if the two by two transformer is used twice.

Figure 3. The energy compaction of the DCT, WHT,
and B matrix transforms for N = 8 and r = 0.95.

Figure 4. The B matrix.

As a second example, suppose that it is desired to approximate the DCT by
adding integer products of the WHT vectors. For small integers, this operation
can be implemented by digital shifts-and-adds, and requires fewer significant
bits to be retained. The matrix C, given in figure 5, is an orthonormal transform
matrix that is similar to the DCT. The two by two matrix, operating on the
vectors of sequency 2 and 6, is a specialization of the general two by two
matrix having orthogonal rows with identical factors.

$$ \begin{bmatrix} A & B \\ s_1 B & s_2 A \end{bmatrix} $$

$s_1$ and $s_2$ are plus or minus one, and $s_1 = -s_2$. The four by four operation on
the vectors of odd sequency is a specialization of the general four by four
matrix with orthogonal rows, identical factors, and the additional requirement
of a positive diagonal.

$$ \begin{bmatrix} A & B & C & D \\ -B & A & s_1 D & s_2 C \\ -C & s_2 D & A & s_1 B \\ -D & s_1 C & s_2 B & A \end{bmatrix} $$

As before, $s_1$ and $s_2$ are plus or minus one, and $s_1 = -s_2$.

The specializations of the general matrix were made by requiring that the
two by two matrix integers have approximately the ratios found in the second (and
third) rows of the A matrix, and that the four by four matrix integers have
approximately the ratios found in the fifth (and eighth) rows of the A matrix.
Since the A matrix transform is the DCT, this insures that the C transform vectors
of sequency 2, 6, 1, and 7 will approximate the corresponding DCT vectors.

The energy compaction results of the C transform, with the results of the WHT
and DCT, are given in figure 6, for r = 0.95 and N = 8. The energy of the vectors
of sequency 2, 6, 1, and 7 is very similar to the energy of the DCT vectors, but
the vectors of sequency 3 and 5 are different. The energy correspondence could
be improved by matching the four by four matrix factors to the average of the
fifth and sixth rows in the A matrix, but there is little potential data compression
remaining. The theoretical performance of the C matrix, in terms of the figure
of merit, is given in table II. The C matrix transform obtains nearly all the
gain of the DCT over the WHT. If the rational form, instead of the integer form,
of the C matrix transform were used, the computation would require sixteen
multiplications by four different factors (seven factors including sign differences).
There is some reduction in complexity from the implementation of matrix A.
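
The orthogonality of the general two by two and four by four forms can be confirmed for any choice of factors. In the sketch below the placement of the signs s1 and s2 follows the pattern that makes every pair of rows orthogonal when s1 = -s2, which is an assumption where the scanned matrices are unclear, and the integer factors are arbitrary illustrative values, not those of the C matrix in figure 5.

```python
import numpy as np

def two_by_two(a, b, s1=-1):
    """General two by two form with identical factors; s2 = -s1."""
    s2 = -s1
    return np.array([[a, b],
                     [s1 * b, s2 * a]])

def four_by_four(a, b, c, d, s1=1):
    """General four by four form with a positive diagonal; s2 = -s1."""
    s2 = -s1
    return np.array([[ a,      b,      c,      d],
                     [-b,      a, s1 * d, s2 * c],
                     [-c, s2 * d,      a, s1 * b],
                     [-d, s1 * c, s2 * b,      a]])

def rows_orthogonal(m):
    """True when the Gram matrix of the rows is diagonal."""
    gram = m @ m.T
    return np.array_equal(gram, np.diag(np.diag(gram)))

# Arbitrary small integers (hypothetical, for illustration only).
small = two_by_two(5, 2)
large = four_by_four(7, 1, 3, 2)
```

Every row of each form has the same squared length (the sum of the squared factors), so dividing by its square root makes the integer matrix orthonormal; that single scale factor can be absorbed into the quantizer.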
Figure 6. The energy compaction of the WHT, DCT, and C matrix transforms.
VII. EXPERIMENTAL IMAGE COMPRESSION RESULTS

Experimental results were obtained for two-dimensional, 8 x 8 sample block
implementations of the transforms considered above. Four video test images,
Harry Reasoner, Two Girls, Two Men, and Band, were used in all tests. These
images have correlation of 0.97 to 0.98 between elements in the scan line, and
fit the first order Markov model, except for the very detailed Band image, which
deviates from the Markov model and has an average in-line correlation of 0.85.

Two different compression experiments were made.

The test images were first compressed by representing either thirty-two or
sixteen of the sixty-four 8 x 8 transform vectors, using an eight bit uniform, full
range quantizer. The other vectors were neglected. The patterns of the vectors
transmitted and neglected are given in figure 7. The vectors are in sequency
order, with the lowest sequency average vector in the upper left corner of the
pattern. The mean-square error for this compression method and the four transforms
is given in table III. The B matrix transform error is intermediate between the
WHT and DCT errors, and the C matrix error is very close to the DCT error. This
is consistent with the Markov model energy compaction results above.
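The vector-retention experiment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program used in the study: the retention pattern `triangular_mask` is hypothetical (the actual patterns are those of figure 7), the eight bit quantizer is omitted, and only the WHT and DCT are built, since the B and C matrices are not reproduced here.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix; row k is the basis vector of index k."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] *= np.sqrt(1.0 / n)
    m[1:, :] *= np.sqrt(2.0 / n)
    return m

def wht_matrix(n=8):
    """Orthonormal Walsh-Hadamard matrix with rows sorted into sequency order."""
    h = np.array([[1.0]])
    while h.shape[0] < n:
        h = np.block([[h, h], [h, -h]])
    # Sequency is the number of sign changes along each row.
    changes = (np.diff(np.sign(h), axis=1) != 0).sum(axis=1)
    return h[np.argsort(changes)] / np.sqrt(n)

def triangular_mask(n, keep):
    """Hypothetical pattern: retain the `keep` coefficients with the lowest u + v."""
    u, v = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    order = np.argsort((u + v).ravel(), kind="stable")
    mask = np.zeros(n * n)
    mask[order[:keep]] = 1.0
    return mask.reshape(n, n)

def truncation_mse(block, t, mask):
    """Transform a block, neglect the masked-out vectors, invert, and return the MSE."""
    coeff = t @ block @ t.T
    recon = t.T @ (coeff * mask) @ t
    return float(np.mean((block - recon) ** 2))
```

Because both transforms are orthonormal, the error equals the energy of the neglected coefficients, so a transform with better energy compaction gives a lower error for the same retention pattern.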

To obtain the greatest transform compression, the transmitted bits should be
assigned to the vectors according to equation 1, and the coefficient quantizers
should be designed for minimum error given the coefficient energy and amplitude
distributions. The optimum theoretical bit assignments and quantizers depend on
the particular transform used. The test images, and most typical images, contain
low contrast, high correlation background areas, and edges where correlation is
low. The bit assignments and quantizer designs based on the stationary Markov
model ignore this nonstationarity, and designs which consider low contrast areas
and edges give improved mean-square error and subjective performance. Such
improved designs have been devised for the WHT, and have been tested with the DCT,
B matrix, and C matrix transforms. The transmission rate and mean-square error
results are given in figure 8, for the test images compressed in the video field.
The DCT gives improved error performance, and the B and C matrix transforms are
intermediate, but the B and C matrix results are relatively poorer than in table III.
The DCT gives more rate reduction than the WHT, about 0.2 to 0.5 bits per sample.
As a two dimensional transform has twice the gain of a one dimensional transform (2),
the theoretical gain of the DCT over the WHT, for r = 0.95, should be twice
the 0.15 bits per sample of table II, or 0.30 bits per sample.

Figure 7. Patterns of vector coefficients retained and neglected.

Table III. The mean-square error for the WHT, DCT, B matrix and C matrix
transforms with a subset of vectors retained.

Mean-square error for 32 vectors retained

             Reasoner   Two girls   Two men   Band

WHT          0.5??      0.806       1.694     3.948
B matrix     0.500      0.738       1.581     3.626
C matrix     0.442      0.?66       1.538     3.310
DCT          0.446      0.660       1.535     3.056

Mean-square error for 16 vectors retained

             Reasoner   Two girls   Two men   Band

WHT          1.619      2.206       4.601     12.322
B matrix     1.507      2.093       4.557     12.056
C matrix     1.427      2.029       4.447     11.897
DCT          1.430      2.031       4.406     11.626

Figure 8. Transmission rate versus error for the four test images.

The lower error of the DCT, B matrix, and C matrix transforms does indicate
subjective improvement in the compressed images. This subjective improvement is
larger at lower total bit rates, due to the relative increase of larger, more
noticeable errors at the lower rates, and due to the more objectionable, blocky
nature of large WHT errors. The B and C matrix errors are subjectively more
similar to the DCT errors than to WHT errors, because the higher energy vectors
approximate the DCT vectors.

It is not surprising that a design optimized for the WHT gives good results
for the DCT and similar transforms. The transform compression introduces errors
in three ways: by neglecting vectors, by using too narrow quantizers, and by
quantization errors within the quantizer ranges. The DCT, because of its
superior energy compaction, reduces the first two sources of error. Although
the quantizers used are quasi-uniform, they do have smaller quantization steps for low
coefficient values, so the third source of error is also reduced. Any compression
design will give better performance with the DCT. From the similarity in energy
compaction, a good design for the WHT should be reasonably effective for the DCT.
However, further performance gains can be made with the DCT by optimizing the
compression designs for the DCT.

The error statistics show that the lower mean-square error of the DCT is
due both to fewer large errors, which nearly always occur at edges, and to
fewer small errors, which occur in flat areas and edges. The subjective
appearance of the compressed image confirms that the DCT produces both smoother
low contrast areas and less distorted edges. Since the low contrast areas have
very high correlation, and since the edges, though not noise like, can be
approximated by a low correlation Markov model, the mean-square error and
subjective results agree with the theoretical result that the DCT is superior
to the WHT for all values of correlation (see table II).

VIII. CONCLUSION

The Karhunen-Loeve transform for data with stationary correlation and the
discrete cosine transform are members of a general class of transforms that
can be obtained by a matrix multiplication of the Hadamard vector coefficients.
This implementation reduces the number of multiplications required. If reduced
compression gain is allowed, the implementation complexity can be further reduced.
The theoretical data compression seems to be a reliable indicator of the
differential in experimental performance of these transforms.
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FRAME AVERAGING 



This report discusses experiments made in frame averaging. As reported previously,
frame repeat has been combined with conditional replenishment, to cope with high
change rates. If the fixed average transmission rate of a conditional replenishment
system is set at a low value, which would allow compression without repeats for
typical teleconference material, the changed frames in high motion scenes can not
be fully transmitted in one frame period. Reduced resolution for the changed frame,
or updating the changed frame in segments, are less desirable than frame repeat.
Frame repeat requires an additional one-field memory at the receiver.

The number of times a frame is displayed (the number of repeats plus one) is
shown in Table I, for color and monochrome test scenes processed at different bit
rates. The number of times a frame is displayed is equal to the number of frame
periods that are required to transmit the next changed frame. Displaying a frame
three times, a rate of ten frames per second, is perceptibly jerky, but not very
objectionable. Although displaying a frame four to eight times is sometimes very
jerky, the effect is not intolerable. The subjective effect of frame repeat depends
on the kind of motion. In Man and Tool, the original motion is rapid and discontinuous,
changing speed and direction. The frame repeat accentuates this. In Cars, the
motion in the first part of the scene is smooth and continuous, and the frame
repeat at 1/16 bpp remains smooth.

Frame averaging has been used at Bell Labs, and was simulated because it was
expected to reduce the jerkiness of frame repeat. Frame averaging is practically
mandatory when frames of interlaced fields are repeated, because the forward and
reverse field display order is otherwise very objectionable. In the current
conditional replenishment simulation, changed regions use a field repeat, so that
frame averaging is not required.

In the operation of conditional replenishment, a frame is repeated until the
next frame is fully transmitted. See the diagram and discussion given in the
introductory section above. The number of displays of a frame depends on the
amount of change in the next frame, but the amount of change usually varies slowly
from frame to frame. After the last portion of a frame has been loaded into the
transmitter output buffer, the next frame is input, transformed, and change detected
during the next frame period. The number of repeats of a frame can be transmitted
as soon as the last part of that frame is transmitted. This information is required
for frame averaging.

Table I: The number of times a frame is displayed.

                                 Rate, bpp
Scene               1     1/2    1/4          1/8          1/16

Wheel of Fortune    1     2      3
Water Skiers        1     2      3-4 (3.4)
Man and Tool                     1-2 (1.2)    2-4 (2.9)    4-8 (6.0)
Man and Book                                  1-3 (1.9)    2-5 (3.9)
Three People                                  1            1-2 (1.6)
Cars, part 1                                  2-3 (2.2)    3-7 (4.4)
Cars, part 2                                  6-7 (6.3)    11-12 (11.5)

The number in brackets ( ) is the average.

The simulated algorithm for frame averaging uses a gradual mixture of the
previous frame (A) and the most recent frame (B). Suppose that the next frame (C) will
require four frame periods for transmission. The previous frame (A) has just been
displayed in unmixed form. Instead of simply displaying the most recent frame (B)
four times, mixtures of the previous (A) and most recent (B) frames will be displayed
for three frames, then the unmixed most recent frame (B). If the next transmission
also requires more than one frame period, this will be followed by further mixed
frames.

Suppose that a sequence of frames requires four frame times for transmission.
Table II shows the time sequence of input, transmitted, and displayed frames. The
expression (2:1,5) designates the mixture of frames 1 and 5 used to replace the
untransmitted frame 2 in the display. It seems reasonable that when a frame is
closer in sequence to some displayed unmixed frame, it should have more of that
frame in its composition. The formula used to define the frame average (I:J,K)
is as follows:

$$ (I:J,K) = \frac{|I-K|}{|J-K|}\,\mathbf{J} + \frac{|I-J|}{|J-K|}\,\mathbf{K} $$

In the coefficients, I, J, and K refer to the frame order numbers; the bold J and K
refer to the actual frame data. For example:

(1:1,5) = frame 1
(2:1,5) = (|2-5|/|1-5|) frame 1 + (|2-1|/|1-5|) frame 5
        = (3/4) frame 1 + (1/4) frame 5
(3:1,5) = (2/4) frame 1 + (2/4) frame 5
(4:1,5) = (1/4) frame 1 + (3/4) frame 5
(5:1,5) = frame 5

The frame displayed in place of the untransmitted frame 2 is an average made by
adding 3/4 times frame 1 and 1/4 times frame 5.
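The weighting rule for (I:J,K) can be written as a short Python sketch. The function names and the representation of a frame as a flat list of sample values are assumptions made for illustration only.

```python
from fractions import Fraction

def average_weights(i, j, k):
    """Weights applied to frames J and K when replacing untransmitted frame I."""
    span = abs(j - k)
    return Fraction(abs(i - k), span), Fraction(abs(i - j), span)

def frame_average(i, j, k, frame_j, frame_k):
    """Mix two frames (flat lists of samples) with the weights for position I."""
    wj, wk = average_weights(i, j, k)
    return [float(wj) * a + float(wk) * b for a, b in zip(frame_j, frame_k)]

# Weights for the worked example: frames 1 and 5 bracket positions 1 through 5.
for i in range(1, 6):
    print(i, average_weights(i, 1, 5))
```

The two weights always sum to one, so the average brightness of the mixture stays between that of the two source frames.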




Table II. Time sequence of input, transmitted, and displayed frames.

Frame order number      1  2  3  4  5  6  7  8  9  10       11       12       13  14

Frame input             1           5           9                             13
Frame transmitted          1  1  1  1  5  5  5  5  9        9        9        9   13
Frame repeat display                   1  1  1  1  5        5        5        5   9
Frame average display                           1  (2:1,5)  (3:1,5)  (4:1,5)  5

There are several hardware requirements in implementing frame averaging.
Knowledge of the number of frame times for the next frame transmission was mentioned
above. Frame averaging requires a third one-field memory at the receiver, to
hold the previous frame during averaging. Circuitry to combine the frames in
different proportions is required. The averaging can be done in the transform
domain, using digital circuitry. If both frames were inverse transformed and
buffered, they could be combined using analog circuitry.

Frame averaging was tested using conditional replenishment compressed material,
consisting of the scenes in table I (except Three People) compressed at the lowest
rate shown in the table. The two color scenes, Wheel of Fortune and Water Skiers,
were compressed to 1/4 bpp and required three displays of each frame (except the
early part of Water Skiers, which required four displays). Frame averaging provided
smoother motion, and the detailed moving wheel was improved. While these scenes
were not objectionable using field repeat, they were better with frame averaging.

The two averaged monochrome scenes which have been displayed in real time are
Man and Tool and Man and Book, which were compressed to 1/16 bpp and require an
average of five displays per frame. These scenes are very jerky using frame repeat,
and are not improved using frame averaging. The moving objects, the tool and book,
are strangely blurred, and appear to expand and shrink in time. This poor performance
is apparently due to both the low frame rate and to the different type of motion.
As computed from Table I, the average frame rate for these scenes is six per second,
rather than ten per second for the color scenes, and the lowest rate is less than
four per second. Both monochrome scenes have a well defined object moving rapidly
and discontinuously over a featureless background, and frame averaging causes the
shape changing artifact mentioned above. The Cars scene was processed with frame
averaging, but not displayed. Since the frame repeat scene is smooth, the frame
average is expected to be good. The Three People scene was not processed using
frame average, as the number of repeats at 1/16 bpp is very low.
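The frame rates quoted above follow from the display counts in Table I. The sketch below reproduces the arithmetic, assuming a source rate of 30 frames per second, which is inferred from the statement that three displays per frame correspond to ten frames per second.

```python
SOURCE_FRAME_RATE = 30.0  # frames per second; assumed, see the lead-in above

def frame_rate(displays_per_frame):
    """Effective rate of new frames when each frame is displayed several times."""
    return SOURCE_FRAME_RATE / displays_per_frame

# Average displays per frame at 1/16 bpp, taken from Table I.
avg_displays = {"Man and Tool": 6.0, "Man and Book": 3.9}
mean_displays = sum(avg_displays.values()) / len(avg_displays)

print(round(frame_rate(mean_displays), 2))  # about six frames per second
print(frame_rate(8))                        # worst case for Man and Tool (eight displays)
print(frame_rate(3))                        # color scenes at 1/4 bpp
```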

Because of the frame averaging artifact described above, the frame averaging
was modified to limit the number of averaged frames, and to use repeated frames to
fill the remaining display periods. If N is the maximum number of averaged frames,
the expression (I:J,K) is modified as follows, with the bold J and K denoting the
frame data:

$$ (I:J,K) = \frac{|I-N-2|}{|J-N-2|}\,\mathbf{J} + \frac{|I-J|}{|J-N-2|}\,\mathbf{K}, \quad \text{for } I < N+2 $$

$$ (I:J,K) = \mathbf{K}, \quad \text{for } I \ge N+2 $$

The previous example changes as follows for N = 2:

(1:1,5) = frame 1
(2:1,5) = (2/3) frame 1 + (1/3) frame 5
(3:1,5) = (1/3) frame 1 + (2/3) frame 5
(4:1,5) = frame 5
(5:1,5) = frame 5
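A sketch of the limited averaging rule in Python follows. It assumes the boundary between the two cases falls at I = N + 2, and, like the expression in the text, it assumes the frame order numbers are counted so that J = 1; the function name is hypothetical.

```python
from fractions import Fraction

def limited_weights(i, j, n):
    """Weights on frames J and K at display position I with at most N averaged frames."""
    if i >= n + 2:
        # Remaining display periods repeat the most recent frame K.
        return Fraction(0), Fraction(1)
    span = abs(j - n - 2)
    return Fraction(abs(i - n - 2), span), Fraction(abs(i - j), span)

# Display positions 1 through 5 for J = 1 and N = 2.
for i in range(1, 6):
    print(i, limited_weights(i, 1, 2))
```

With N = 1 the same function gives a single half-and-half mixture followed by repeats of the most recent frame.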




After frame 5 is received, the display consists of two averaged frames and two
repeated frames. For N equal to 2, there are never more than two averaged frames,
and the remaining frames displayed are repeats of the most recent frame. The Man
and Tool, 1/16 bpp, test scene has been averaged with N equal to one and two, but
has not been displayed in real time. It is expected that limited frame averaging
will smooth scenes with higher frame rates, without adding noticeable artifacts
where lower frame rates are used. While a single averaged frame probably gives
minimal improvement, its implementation does not require the third memory mentioned
above.

It appears that frame averaging gives subjective improvement at moderate frame
rates, but can not cope with low frame rates. If this is correct, frame averaging
is not a highly effective investment in hardware. However, it might be justifiable
as an improvement in the limiting mode of a system designed to operate at ten frames
per second or more for all change levels. A system using motion tracking might be
able to interpolate between frames in a sophisticated way, but would be much more
complex.

This report describes the computer work performed under contract NAS2-9703.
This work includes the generation of computer programs, the performance of
computer simulation test runs, and the transfer of programs and video data to the
new SEL 32 computer.

Table I lists the computer programs developed or transferred to the SEL 32.
Not all the programs developed under NAS2-9703 have been transferred to the new
computer. The function of E88E is included in E88F. IREP and TCOR are all Fortran,
and can be easily transferred, but the studies performed using these programs
have been completed, reported, and submitted for publication, so that no further
need for these programs is anticipated. T8X8 performs cosine and quasi-cosine
intraframe transform compression. A cosine transform program is needed, but T8X8
uses SEL 840 machine language subroutines developed by David Hein. E88F simulates the
latest conditional replenishment method, and FRAVG generates frame averaged displays.
Programs REFMT and ERFMT are derived from a program written by Larry Hofman, which
displays on the SEL 32 the 8432 format tapes written on the SEL 840. These programs
convert the six bit video samples to the most significant six of eight bits,
while otherwise retaining the previous format.
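The sample conversion described above amounts to a two bit left shift. The following Python sketch illustrates that step only; the function name is hypothetical and the tape format handling of the actual programs is not shown.

```python
def six_to_eight_bit(sample):
    """Place a six bit video sample in the most significant six bits of an eight bit word."""
    if not 0 <= sample <= 0x3F:
        raise ValueError("expected a six bit sample (0 to 63)")
    return sample << 2

def convert_samples(samples):
    """Convert a sequence of six bit samples, leaving their order unchanged."""
    return [six_to_eight_bit(s) for s in samples]
```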

In the second part of table I, the list of programs transferred to the SEL
32 includes many frequently used service programs. Notable by their absence are
programs to record images (currently impossible), to measure mean-square error
(a transferred version of DDIF needs debugging), or to create Dicomed format
images from D or E format video tapes.

Table II lists the original video data files transferred to the SEL 32.
A few of the files in the sequences D101-D128 and D151-D170 are missing, but all
files used in past studies are available on the SEL 32. Sequences ET02, ET04,
and ET05 are not transferred because of poor video quality, but ET08 may be
usable. All sequences used in previous studies are available on the SEL 32.

Table III lists the conditional replenishment simulation test runs made under this
contract. Approximately thirty others were made under a previous contract, using an
earlier conditional replenishment method. Many of these simulations were recorded on
the Echo Science disc, for real time display, but most have been overwritten or have
been degraded by time. The Echo Science disc can be readily overwritten from the
SEL 32 or the SEL 840, but the simulation data has not been converted to the
SEL 32 format. The original video sequences, and the conditional replenishment
simulation programs have been converted. A monochrome conditional replenishment
simulation requires about eight hours, while a color simulation requires about
four hours per tape. The simulations of table III required about 140 hours of
computer time. Conversion from the SEL 840 to the SEL 32 requires about one hour
per tape on each computer, for a total of 25 hours if both computers are run
simultaneously. Alternately, the tapes of table III could be converted at the
computer center for about 25 dollars each, a total of 625 dollars.

Table I. COMPUTER PROGRAMS

Program Function                              SEL 840 Name   SEL 32 Name

Programs Developed for NAS2-9703

Revised Monochrome Conditional Replenishment  E88E           not transferred
Landsat Element Replication Compression       IREP           "
Quasi Cosine Transform Correlation            TCOR           "
Cosine/Quasi-Cosine Compression               T8X8           "
Color/Monochrome Conditional Replenishment    E88F           E88F
Frame Averaging                               FRAV           FRAVG
8432 D to SEL 32                                             REFMT
8432 E to SEL 32                                             ERFMT

Programs Transferred for NAS2-9703

Intraframe Hadamard Compression               E8X8           E8X8
Field to Frame (D)                            DINT           DINT
Frame to Field (D)                            DSTD           DSTD
YIQ to RGB                                    EYER
RGB to YIQ                                    DYIQ           ERTOY
D to E                                        DTOE
E to D                                        ETOD           DORE
D Copy                                        DCPY  }
E Copy                                        ECPY  }        ECOPY
D Display                                     DDSP  }
E Display                                     EDSP  }        DISP4
Frame 1 Partial Test Ramp                     RAMP           JRAMP
Color Test Bars                               KCOL           KCOL

Table II. ORIGINAL VIDEO FILES TRANSFERRED

Files               File Description

D101-D128           four frame sequence, monochrome
D151-D170           "    "     "         "
D201-D2??           single frame, monochrome
DC01-DC24           single frame, color
LS01-LS04           Landsat frame, D format
ET01                Man and Tool, 6? frame sequence, monochrome
ET03                Cars,         "  "     "         "
ET06                Man and Book, "  "     "         "
ET07                Three People, "  "     "         "
EC09, EC10, EC11    Wheel of Fortune, 59 frame sequence, color
EC12, EC13          Water Skiing,     46 "     "         "

Table III. CONDITIONAL REPLENISHMENT SIMULATION

Number   Input File    Rate, bits per sample   RGB Version

Monochrome Simulations Using E88E

E1       ET01          1/4
E2       ET01          1/8
E3       ET07          1/8
E4       ET06          1/8
E5       ET03          1/8
E6       ET01          1/16
E7       ET01          1/16
E8       ET06          1/16
E9       ET07          1/16
E10      ET03          1/16

Color Simulations Using E88F (YIQ)

F1       EC09          1/2
F2       EC09,10,11    1/2                     X
F3       "             1
F4       EC12,13       1
F5       "             1/2                     X
F6       EC09,10,11    1/4                     X
F7       EC12,13       1/4                     X

Single compressed images too numerous to list were produced under this
contract. They were generated in the design of monochrome and color compression
bit assignments and quantizers, in the investigation of Landsat element replication
compression, and in the comparison of Hadamard, cosine, and quasi-cosine transforms.
Because of the much smaller amount of data, these images could be reproduced or
transferred more readily than the conditional replenishment files.

It would be useful to have the SEL 840 available to display previous results,
generated in many past studies, but it is far from necessary. An adequate basis
for future work exists on the SEL 32.
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SECTION C 


LANDSAT IMAGE PROCESSING
ELEMENT REPLICATION COMPRESSION



CONTENTS


ELEMENT REPLICATION COMPRESSION

APPENDIX A    A BRIEF REVIEW OF LANDSAT IMAGE CLASSIFICATION



I. INTRODUCTION 


Image transmission rate compression methods either fully preserve the original
image information, or introduce errors to some degree. Information preserving
compression methods for multispectral images often compute the predicted value of
the next image element spectral values, using previously transmitted information,
and transmit entropy coded differences between the predicted and actual values.
The transmission rate can closely approach the actual information content of
the image. Experiments with Landsat images, which have four spectral bands, have
shown that the transmission rate can be reduced from seven or eight bits per
element in each band to three or four bits per element in each band (2),(3).
Because of the presence of fine spatial detail, and because the two visible light
bands have small correlation with the two infrared bands, Landsat images have
higher information content than color television images, and may appear confused
to an untrained observer.

Non-information-preserving compression methods remove spatial and spectral
information which either is not needed to preserve information or is not useful
enough to justify the cost of transmission. Habibi has recently reviewed adaptive
compression techniques, and has classified those which introduce error as predictive,
transform, or clustering. Predictive techniques can introduce error by
transmitting, with reduced accuracy, some of the possible differences between an
element and the predicted value of the element. Image basis vector transforms in
the spatial or spectral domains introduce error if the transform coefficients are
transmitted with less than full range or accuracy. When predictive and spatial transform
compression techniques are used at rates less than two bits per element per band,
fine spatial detail is objectionably blurred.

Clustering is a familiar method of multispectral data classification, and has
been combined with multispectral image compression for transmission or storage
(9),(10),(6). The image elements in a region are grouped into four-dimensional
spectral clusters, and the four-dimensional centroids of the clusters and the
cluster designation of each element are transmitted. Because the errors are
spatially uncorrelated with [illegible], there is no spatial blurring in the
compressed image. Cluster coded images have lower error and better subjective
performance at low transmission rates than images compressed using predictive or
transform techniques. Cluster coding causes errors between the image elements and
the centroids used to represent them. These errors are usually invisible in
isolated elements, but sometimes cause minor contouring when many adjacent elements
have the same centroid. Cluster coding has two disadvantages. Many computations are
required to group the image elements into clusters, and to define the centroids.
Because of the limited number of clusters, untypical elements may be represented
with large error.

Picture element replication has been investigated, to develop a new method of
spectral, non-spatial compression without the disadvantages of cluster coding.
In replication compression, a table of previously transmitted elements is maintained
at both the transmitter and receiver. The distance from the current image element
to each stored element is computed. If a stored element is within the prescribed
error distance of the current element, the table indication of that stored element
is transmitted, and the current element is represented by the stored element in the
receiver image. If many image elements are spectrally similar, the transmission rate
approaches the number of bits required to indicate a particular table entry. If no
stored element is within the required distance, a special indicator word and the
true element value are transmitted.
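The procedure can be sketched in Python as follows. This is an illustrative sketch under several assumptions that the text does not specify: the table is updated first-in, first-out, one table code is reserved for the special indicator word, the first stored element within the bound is used, and the bound is applied to the squared distance. The function names are hypothetical.

```python
from collections import deque

def sq_dist(a, b):
    """Squared Euclidean distance between two spectral elements."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def replicate_encode(elements, table_bits, mse_bound):
    """Return ("index", i) for a replicated element or ("direct", element) otherwise."""
    # One code word is reserved for the indicator (assumption), so the table
    # holds 2**table_bits - 1 entries; the oldest entry is dropped when full.
    table = deque(maxlen=2 ** table_bits - 1)
    symbols = []
    for element in elements:
        element = tuple(element)
        hit = next((i for i, stored in enumerate(table)
                    if sq_dist(element, stored) <= mse_bound), None)
        if hit is None:
            symbols.append(("direct", element))
            table.append(element)
        else:
            symbols.append(("index", hit))
    return symbols

def replicate_decode(symbols, table_bits):
    """Rebuild the receiver image by mirroring the transmitter's table updates."""
    table = deque(maxlen=2 ** table_bits - 1)
    image = []
    for kind, value in symbols:
        if kind == "direct":
            table.append(value)
            image.append(value)
        else:
            image.append(table[value])
    return image
```

Because the receiver applies the same table updates as the transmitter, no table contents need to be sent beyond the directly transmitted elements themselves.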





II. REPLICATION COMPRESSION AND RANDOM CODING


Image compression by element replication is similar to the method of random
coding, which achieves the rate-distortion bound (14)-(16). Proofs of the
rate-distortion theorem are related to proofs of the channel capacity theorem.
The rate-distortion proof shows that, as the number of signal dimensions, n, becomes
large, a random selection of $M = 2^{nR(d)}$ representing elements will have an element
capable of representing any element to be transmitted with distortion less than d,
with probability approaching unity. For mean-square error distortion and independent
gaussian random variables of variance $\sigma^2$, $d = \sigma^2\, 2^{-2R(d)}$. Each
transmitted element can be indicated by the coded designation of a representing
element, at rate R(d), with distortion less than the bound, d. The proof of the
converse theorem shows that no possible source coding method improves performance.
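As a worked illustration of the gaussian relation above, the rate can be written explicitly in terms of the distortion. The numerical values chosen here (a distortion of one quarter of the variance, and n = 4) are assumptions made only for the example.

```latex
% Solving d = \sigma^2 2^{-2R(d)} for the rate per dimension:
R(d) = \tfrac{1}{2}\log_2\!\left(\frac{\sigma^2}{d}\right), \qquad 0 < d \le \sigma^2 .

% Example: d = \sigma^2 / 4 gives
R(d) = \tfrac{1}{2}\log_2 4 = 1 \ \text{bit per dimension},

% so that for n = 4 dimensions the number of representing elements is
M = 2^{nR(d)} = 2^{4} = 16 .
```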

The result of the theorem, that there is (with probability approaching one)
a representing element with less than the required error, is proved only for large n.
Landsat image elements are described using four dimensions, and may have a true
dimensionality of two, but even a two dimensional design has higher dimensionality
than most designs, which are one dimensional pulse code modulation (PCM). Replication
compression uses actual elements, rather than randomly selected elements, as
representing elements. This is essentially equivalent to random coding, because the
representing elements do not have a simple geometric structure. Signal sets with
simply defined geometric structures are inefficient in approaching the random coding
bound, because regions centered on such signals have systematic gaps and overlaps
in the packing or covering of n dimensional space (17). New representing elements
are added when there is no element within the required error. This corrects for the
effect of small n, at the cost of an increase in rate. When a representing element
is used, it is indicated by a coded designation, using 4R bits, of one of the
$M = 2^{4R}$ stored elements, where R is the rate per dimension or spectral band.

III. THE ERROR BOUND AND THE EXPECTED ERROR

In element replication compression, an image element is represented by a
previously transmitted element only when it is within a predetermined distance of
that previous element. The square of the distance bound is the mean-square error
bound. Since some replicating elements will be at less than the bounding distance,
the expected mean-square error (MSE) is less than the MSE bound.

Suppose that the distance bound is d. After Sommerville (p. 138), the
surface content of an n dimensional sphere of radius r is

$$ S_n = C_n r^{n-1} $$

where $C_n$ is a known function of n. The volume of an n dimensional sphere of
radius d is

$$ V_n = K_n d^n $$

where $K_n$ is also a known function of n. After Sommerville, the volume can be
obtained by integration over concentric n dimensional surfaces, $S_n$, as follows:

$$ V_n = \int_0^d C_n r^{n-1}\,dr = \frac{C_n d^n}{n}, \qquad C_n = n K_n $$

To compute the expected mean-square error (MSE), we assume that the elements
represented by a stored element are uniformly distributed over the volume of an
n dimensional sphere of radius d, having the representing element as its center.

$$ E(\mathrm{MSE}) = E(r^2) = \int_0^d r^2\,\frac{dV_n}{V_n} = \frac{C_n}{V_n}\int_0^d r^2\, r^{n-1}\,dr = \frac{C_n}{V_n}\,\frac{d^{n+2}}{n+2} $$

$$ E(\mathrm{MSE}) = E(r^2) = \frac{n}{n+2}\,d^2 $$

$$ E(\mathrm{MSE}) = \frac{n}{n+2}\,(\text{MSE Bound}) \qquad (1) $$

In table I, the MSE bound and experimental MSE are shown for two images, Salton
Sea and Bald Knob. The ratio of the experimental MSE to the MSE bound is about
one-half, the calculated ratio for n = 2. The fraction of elements directly
transmitted is also shown in the table, and is usually a few percent. Direct transmission
causes no error, and the ratio of the experimental error to the bound is corrected
accordingly. Because the two visible bands are highly correlated and the two
infrared bands are highly correlated, the true dimensionality of the data is more
nearly two than four. At small MSE, the dimensionality is higher, because the
representing elements have neighbors in all possible directions.

The MSE can be further reduced by using the closest stored element to represent
the current element, rather than the first representing element within the required
distance. The number of distance computations is approximately doubled. The
expected MSE can not be precisely determined, but is estimated in the appendix, under
the assumption that the number of alternate representing elements is such that the
volume gaps between radius d spheres are equal to the volume overlaps. The resultant
ratios of expected MSE to the bound MSE are given in table II, for n = 1, 2, 3, and 4.
In table III, the MSE bound and experimental MSE are given for the case where the
closest element in the table is used to represent an element (if the distance is
within the bound). The MSE is significantly reduced, from about one-half to about
one-third of the MSE bound. The ratios of experimental to bound MSE again are
close to the ratios expected for n = 2, the approximate dimensionality of the image
data. This result shows that replication compression is efficient in using spectral
correlation.

Because replication compression describes a picture element in all spectral bands,
the rate, R, is given in bits per element in tables I and III. In sections II and VIII
of this paper, the rate is given in bits per element per spectral band, in accordance
with general usage. For the Salton Sea image, the tabulated MSE is in units of the
second least significant bit of the original eight bit data. For the Bald Knob
image, the data is in units of the least significant bit of the original data. In
section VIII, the data plotted for both images is in units of the least significant
bit of the original data.
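The ratio n/(n+2) of expected MSE to the MSE bound can be checked numerically. The sketch below is a Monte Carlo estimate under the same uniform-distribution assumption used in the derivation; the function name, sample count, and seed are arbitrary choices for illustration.

```python
import numpy as np

def expected_mse_ratio(n, samples=200_000, seed=0):
    """Estimate E(r^2) / d^2 for points uniform in an n dimensional sphere of radius d = 1."""
    rng = np.random.default_rng(seed)
    # Uniform directions: normalize independent gaussian vectors.
    direction = rng.normal(size=(samples, n))
    direction /= np.linalg.norm(direction, axis=1, keepdims=True)
    # Radial density proportional to r^(n-1) gives r = u^(1/n) for uniform u.
    radius = rng.random(samples) ** (1.0 / n)
    points = direction * radius[:, None]
    return float(np.mean(np.sum(points ** 2, axis=1)))

for n in (1, 2, 3, 4):
    print(n, round(expected_mse_ratio(n), 3), round(n / (n + 2), 3))
```

For n = 2 the estimate should fall near one-half, the value compared with the experimental ratios in table I.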

Table I. Mean-Square Error and Error Bounds.
(First Element at Distance < d Used)

Salton Sea Image

Table Size   MSE     Experimental   Exp. MSE/   Fraction      Corrected   Transmission Rate,
in Bits      Bound   MSE            MSE Bnd.    Transmitted   Ratio       Two Methods

    6           5       2.60          .52          .102          .58        9.05/8.33
    5          10       5.01          .50          .053          .53        6.54/5.37
    5          15       7.34                       .018          .50        5.52/3.95
    5          20       9.03          .45          .012          .46        5.34/3.52
    4          30      12.88          .43          .014          .44        4.04/2.95
    4          40      16.70          .42          .005          .44        4.15/2.57
    3          60      24.47          .48          .021                     3.56/2.47

 average                              .47

Bald Knob Image

Table Size   MSE     Experimental   Exp. MSE/   Fraction      Corrected   Transmission Rate,
in Bits      Bound   MSE            MSE Bnd.    Transmitted   Ratio       Two Methods

    7           5       2.62          .52          .178          .63       12.52/12.75
    6          10       5.74          .57          .089          .63        8.66/8.36
    6          15       8.61          .57          .042          .59        7.26/6.55
    5          20      10.92          .55          .051          .58        6.43/5.95
    5          30      15.65          .52          .018          .53        5.53/4.42
    4          40      19.87          .50          .034          .52        4.95/4.11
    4          60      27.42          .46          .011          .47        4.30/3.20
    3          80      31.94          .40          .032          .41        3.87/3.10

 average                              .51                        .55

Table II. Theoretical Ratios of Experimental MSE to MSE Bound.

                                   Estimated        Estimated
Dimension    Ratio = n/(n+2)       Change           Ratio
    n        (First Element)       Using Closest    Using Closest

    1             .33                 -.22             .11
    2             .50                 -.19             .31
    3             .60                 -.18             .42
    4             .66                 -.17             .49
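
The first element ratio n/(n+2) in table II can be checked with a short worked integral. The sketch below assumes that the elements represented by a stored element are uniformly distributed within an n dimensional sphere of radius d, with d^2 equal to the MSE bound; the uniform density p(r) is an assumption made for this illustration.

```latex
\begin{align*}
p(r) &= \frac{n\, r^{n-1}}{d^{n}}, \qquad 0 \le r \le d \\
E(r^2) &= \int_0^d r^2 \, \frac{n\, r^{n-1}}{d^{n}} \, dr
        = \frac{n}{n+2}\, d^2
\end{align*}
```

For n = 2 this gives one-half of the bound, and for n = 4 two-thirds, in line with the .50 and .66 entries of the first column.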

Table III. Mean-Square Error and Error Bounds.
(Closest Element at Distance < d Used)

Salton Sea Image

Table Size   MSE     Experimental   Exp. MSE/   Fraction      Corrected   Transmission Rate,
in Bits      Bound   MSE            MSE Bnd.    Transmitted   Ratio       Two Methods

    6           5       2.05          .41          .097          .45        8.91/8.26
    5          10       3.84          .38          .051          .40        6.47/5.39
    5          15       5.21          .35          .017          .36        5.50/4.07
    5          20       6.06          .30          .012          .31        5.36/3.77
    4          30       8.75          .29          .013          .30        4.38/2.97
    4          40      10.46          .26          .006          .26        4.18/2.64
    3          60      18.48          .31          .016          .31        3.42/2.29

 average                              .33                        .34

Bald Knob Image

Table Size   MSE     Experimental   Exp. MSE/   Fraction      Corrected   Transmission Rate,
in Bits      Bound   MSE            MSE Bnd.    Transmitted   Ratio       Two Methods

    7           5       2.00          .40          .171          .48       12.31/12.64
    6          10       4.12          .41          .085          .45        8.54/8.41
    6          15       5.88          .39          .039          .41        7.18/6.62
    5          20       7.17          .36          .050          .38        6.44/5.89
    5          30      10.23          .34          .016          .35        5.46/4.62
    4          40      11.88          .30          .035          .31        4.97/4.19
    4          60      17.19          .29          .013          .29        4.36/3.29
    3          80      22.63          .28          .027          .29        3.75/2.92

 average                              .35                        .37

IV. THE TABLE SIZE AND THE TRANSMISSION RATE


The number of bits required to indicate one of the representing elements
stored in the table is a lower bound on the transmission rate per element. If the
number of bits used to indicate a table location is L, and if one location indicator
is reserved to indicate direct transmission of an element, the table size is 2^L - 1.
Suppose that the fraction of elements directly transmitted is f(L), a function of L.
If the number of bits needed to directly describe an element is M, and if each
directly transmitted element is inserted into the table at a location designated
by L bits, the average transmission rate R(L), in bits per element, is

    R(L) = (1 - f(L)) L + f(L) (2L + M)

         = L + f(L) (L + M)                                           (2)

The average rate is greater than L by an amount depending on f(L). As L and the
table size increase, f(L) decreases. The rate is minimum at some value of L,
depending on the image correlation and the MSE bound. The above equation shows
that if L is increased by one, rate is reduced if f(L) (L + M) is reduced by more
than one, that is, if f(L) decreases by more than about 1/(L + M).
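
As a minimal sketch of equation 2, the function below evaluates the average rate for a given L, M, and f(L). The function names are illustrative, and M = 24 is an assumed value chosen only for the example; L = 6 and f(L) = .102 are taken from the first Salton Sea row of table I.

```python
def average_rate(L, M, f):
    """Average bits per element, equation 2: (1 - f) L + f (2L + M)."""
    return (1.0 - f) * L + f * (2 * L + M)


def simplified_rate(L, M, f):
    """The same rate in the simplified form L + f (L + M)."""
    return L + f * (L + M)


if __name__ == "__main__":
    # M = 24 is an assumption for illustration, not a value stated here.
    print(round(average_rate(6, 24, 0.102), 2))
    print(round(simplified_rate(6, 24, 0.102), 2))
```

Both forms give the same value, since the second is only an algebraic rearrangement of the first.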

The minimum value of R can be found, if the function f(L) is known. If L
is zero, all elements are directly transmitted: f(0) = 1. As L becomes large,
repeated transmission of similar elements is avoided, but some minimum number of
elements must be transmitted to represent the data within the required error.

Suppose that f(L) is larger than this minimum, because of repeated transmissions
of similar elements. If L is increased by one, twice as many of the required
elements can be stored, and the number of repeated transmissions of required
elements can be reduced by one-half. This implies that f(L) = 2^{-L}, for f(L)
greater than the minimum. f(L) was measured for a wide range of L, at two MSE
bounds. The results, plotted in figure 1, are proportional to 2^{-L}, until L
becomes large and f(L) becomes small. (The decrease in f(L) for large L indicates
that the table management is not optimum.)

Figure 1: The fraction of elements directly transmitted, f(L), versus
the number of bits to indicate table location, L.

    O SALTON SEA, MSE BOUND = 10
    □ SALTON SEA, MSE BOUND = 20
    O BALD KNOB, MSE BOUND = 10

We substitute f(L) = 2^{-L} in the equation for R:

    R(L) = L + f(L) (L + M)

         = L + 2^{-L} (L + M)

The minimum rate is found by setting the derivative of R(L) equal to zero.

    dR/dL = 0 = 1 - 2^{-L} (\ln 2)(L + M) + 2^{-L}

    R(L)_{min} = L + \frac{L + M}{(\ln 2)(L + M) - 1}

               = L + \frac{1}{\ln 2 - 1/(L + M)}                      (3)

               \approx L + 1.5
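
The minimum of equation 3 can be checked numerically. The sketch below assumes f(L) = 2^{-L} exactly (a proportionality constant of one) and M = 24; both are illustrative assumptions, as are the function names. It locates the stationary point of R(L) by bisection on the derivative and compares the rate there with the closed form.

```python
import math


def rate(L, M):
    """R(L) = L + 2^{-L} (L + M), with f(L) = 2^{-L} assumed."""
    return L + 2.0 ** (-L) * (L + M)


def closed_form_min(L, M):
    """Equation 3: L + 1 / (ln 2 - 1/(L + M)), valid at the stationary L."""
    return L + 1.0 / (math.log(2) - 1.0 / (L + M))


def stationary_L(M, lo=0.0, hi=32.0, iterations=80):
    """Solve dR/dL = 0 for real-valued L by bisection."""
    def slope(L):
        return 1.0 - 2.0 ** (-L) * (math.log(2) * (L + M) - 1.0)

    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if slope(mid) < 0.0:
            lo = mid  # rate still falling, minimum lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    M = 24  # assumed number of bits describing one element
    L_star = stationary_L(M)
    print(L_star, rate(L_star, M) - L_star)
```

In practice L is an integer, so the rate at the best integer L is slightly above the value at the stationary point.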

The first data given in the transmission rate column of tables I and III correspond to
the method described above. Data is given only for the L giving the minimum R(L),
and R(L) ranges from L + 0.04 to L + 3.05, except for the cases where L equals 7.
The larger values of R - L occur for small MSE bounds, where the minimum f(L) is
relatively large. The table size could be made adaptive, by increasing L when f(L)
is larger than the value for R(L)_{min} given above.

V. THE ERROR BOUND AND THE TABLE SIZE


The table size for minimum rate increases as the error bound decreases. A
given distance or mean-square error bound defines an n dimensional sphere, which
contains those elements that can be represented by the sphere center. The spheres
whose centers are the directly transmitted elements contain all the original data
elements. For compression at the minimum transmission rate, a large fraction of
the elements to be transmitted must be within the error bound of a currently stored
element. If a fixed volume must be filled by spheres, and if the volume of each
sphere is decreased by decreasing the MSE bound, the number of the spheres, and
the number of centers stored, must be increased.

The volume of the region represented by a transmitted element is

    V_e = C K_n d^n = C K_n (MSE Bound)^{n/2}

C is a constant less than one, which allows for the overlap of spheres. For a
table indexed using L bits, the total volume represented by stored elements is

    V_t = 2^L C K_n (MSE Bound)^{n/2}

Solving for L,

    2^L = (V_t / C K_n) (MSE Bound)^{-n/2}

    L = -(n/2) \log_2 (MSE Bound) + \log_2 (V_t / C K_n)              (4)

Since MSE is proportional to the MSE bound,

    L = -(n/2) \log_2 MSE + a constant

If MSE is reduced by 6 dB, to one-fourth its former value, L must increase n bits,
or one bit per dimension. If the effective dimensionality of Landsat data is two,
reducing MSE by one-half requires increasing L by one.
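
The two numerical statements above follow directly from equation 4. A small sketch, with an illustrative function name, computes the increase in L needed when the MSE is multiplied by a given factor:

```python
import math


def extra_table_bits(n, mse_factor):
    """Increase in L when MSE is multiplied by mse_factor, from equation 4.

    n is the effective dimensionality; mse_factor < 1 means a lower MSE.
    """
    return -(n / 2.0) * math.log2(mse_factor)


if __name__ == "__main__":
    print(extra_table_bits(4, 0.25))  # MSE to one-fourth, four dimensions
    print(extra_table_bits(2, 0.5))   # MSE halved, two effective dimensions
```

The first call gives n = 4 bits, one bit per dimension, and the second gives one bit.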

The values of L used in tables I and III were selected for minimum rate at each
particular error bound. The data of these tables is plotted in figure 2, giving
log MSE versus L. The figure is complicated because the same L is optimum for
several similar MSE bounds. Solid lines connect the upper points in each set, and
dashed lines the lower. The data are in good agreement with the relation found here.

The MSE bound directly determines the MSE, and influences the rate by determining
the table size and L for minimum rate, to within a constant.

Figure 2: The experimental MSE versus the number
of bits to indicate table location, L.

    O SALTON SEA - FIRST ELEMENT (TABLE I)
    □ SALTON SEA - CLOSEST ELEMENT (TABLE III)

VI. TABLE MANAGEMENT


Some system must be used for deleting and entering new elements in the table
of replicating elements. The simplest system is to insert the first transmitted
elements into the table in order of transmission, and to replace them in the same
order after the table is filled. Although this system is workable, the method
adopted was to record the number of uses of each table element, and to replace the
least used elements first. The scan method transmits the elements of one line in
order, then begins the next line. The table is sorted according to usage
after one-fourth of the table has been replaced, and also at the end of each line
of elements. After the sorting at the end of each line, the usage counts are
divided by two, to gradually eliminate elements no longer needed. Replacing the
table elements according to usage, instead of in arbitrary order, results in small
decreases in transmission rate and in MSE, both about ten percent.
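
The usage-ordered replacement described above can be sketched as a small class. This is an illustration under stated assumptions, not the program used for the experiments: the class and method names are hypothetical, a new element starts with a use count of one, and the halving of counts uses integer division.

```python
class UsageTable:
    """Table of replicating elements, replaced least-used first (sketch)."""

    def __init__(self, size, zero_element):
        # Each entry is [element, use_count]; the table starts all zero.
        self.entries = [[zero_element, 0] for _ in range(size)]
        self.replaced_since_sort = 0

    def _sort(self):
        # Most used first, so the least used entries sit at the end.
        self.entries.sort(key=lambda entry: entry[1], reverse=True)
        self.replaced_since_sort = 0

    def use(self, index):
        """Record one use of the entry at the given table location."""
        self.entries[index][1] += 1

    def insert(self, element):
        """Replace the next least-used entry; return the location written."""
        index = len(self.entries) - 1 - self.replaced_since_sort
        self.entries[index] = [element, 1]
        self.replaced_since_sort += 1
        # Re-sort once one-fourth of the table has been replaced.
        if self.replaced_since_sort >= max(1, len(self.entries) // 4):
            self._sort()
        return index

    def end_of_line(self):
        """Sort by usage, then halve all counts (integer halving assumed)."""
        self._sort()
        for entry in self.entries:
            entry[1] //= 2
```

Because the ordering depends only on elements and counts already known to both ends, a receiver running the same class would keep an identical table.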

In equation 2 of section IV above, 2L + M bits per element are used when an
element is directly transmitted. L bits indicate that the current element is to
be directly transmitted, M bits describe the element, and a second L bits indicate
the table location where the transmitted element is to be stored. If the algorithm
determining the table location is applied at the receiver as well as at the
transmitter, the L bits defining the element to be replaced are not required. However,
the average rate is not much affected by this change. Equation 2 becomes

    R(L) = L + f(L) M

Equation 3 becomes

    R(L)_{min} = L + 1/\ln 2

The experimental rates were obtained using 2L + M direct transmission bits, but
the information in tables I and III is sufficient to allow calculation of the rate
for L + M direct transmission bits.

It was noted in section IV that the decreasing f(L) for large L, shown in
figure 1, indicates that the table management is not optimum. If the most useful
2^L elements were always retained in the table, f(L) would always decrease as
L increases. However, it is difficult to select correctly the elements that will
be needed to replicate future input elements. Perhaps the ordering of elements
according to usage should be more frequent for larger L. When table size is
large, it may happen that several stored elements are used to represent elements
in a region that one element would represent in a small table. As the usage
counts for the several elements are smaller, they all may be replaced when one
stored element with the total usage count would be retained. Since f(L) is quite
small for large L, little rate reduction can be gained by improving table management,
and no further experiments were made.

VII. COMPRESSION USING SPATIAL CORRELATION


Element replication compression is primarily spectral, but the elements
contained in the table change as the image is scanned. Any spatial proximity of
similar elements increases the probability that a usable replication element will
be in the table. It is also possible to indicate recently used elements by a few
bits, so that spatial correlation would reduce the average transmission rate. The
simplest method, which is used here, is to use a one bit word to indicate that a
replicating element is the same as the last one used, and to prefix all the L bit
table designations by the complementary one bit word.

If the probability that the previous replicating element is used again is p, the
rate for this method is

    R_1 = p + (1 - p - f(L)) (L + 1) + f(L) (2L + 1 + M)

The rate for the previously described system is, from equation 2,

    R = (1 - f(L)) L + f(L) (2L + M)

    R - R_1 = -p + pL - (1 - p - f(L)) - f(L)

            = pL - 1

The new method achieves a reduction in rate if pL > 1. At the lower transmission
rates, L is 3, 4, or 5, and p is 1/2 or more, so that the transmission rate is
reduced about one bit per element. Data obtained using this method is given in
tables I and III, as the second number in the transmission rate column. The gain
for large table sizes is small, since p is small.
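
The gain pL - 1 can be confirmed by evaluating both rate expressions directly. In the sketch below the function names are illustrative, and the values M = 24 and f = .05 are assumptions chosen only for the example; the gain does not depend on either.

```python
def rate_plain(L, M, f):
    """Equation 2: rate without a repeat indicator."""
    return (1 - f) * L + f * (2 * L + M)


def rate_repeat(L, M, f, p):
    """Rate R_1 with a one bit repeat indicator used with probability p."""
    return p + (1 - p - f) * (L + 1) + f * (2 * L + 1 + M)


if __name__ == "__main__":
    L, M, f, p = 5, 24, 0.05, 0.5  # M and f are assumed example values
    gain = rate_plain(L, M, f) - rate_repeat(L, M, f, p)
    print(round(gain, 6), p * L - 1)
```

For L = 5 and p = 1/2 the reduction is 1.5 bits per element, consistent with the statement that the rate falls by about one bit.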

If a one bit word indicates the last table entry used, and a two bit word
indicates some other entry with probability q of being used, the L bit table
designations must be prefixed by the complementary two bit word. The rate for
the two indicator method is

    R_2 = p + 2q + (1 - p - q - f(L)) (L + 2) + f(L) (2L + 2 + M)

    R_1 - R_2 = -2q + q(L + 1) - (1 - p - q - f(L)) - f(L)

              = qL - 1 + p

There is a further reduction in rate due to the second special designation, if
q > (1 - p)/L. For L = 5 and p = 1/2, q must be greater than 1/10; if q is
1/5, the rate reduction is 1/2 bit per element. No experiments were made using
such second special designations.
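
The same check can be made for the two indicator method. The sketch below uses illustrative function names and assumed values M = 24 and f = .05, which cancel in the difference; L = 5, p = 1/2, and q = 1/5 are the values quoted in the text.

```python
def rate_repeat(L, M, f, p):
    """Rate R_1 with a one bit repeat indicator."""
    return p + (1 - p - f) * (L + 1) + f * (2 * L + 1 + M)


def rate_two_indicators(L, M, f, p, q):
    """Rate R_2 with a one bit and a two bit special designation."""
    return p + 2 * q + (1 - p - q - f) * (L + 2) + f * (2 * L + 2 + M)


if __name__ == "__main__":
    L, M, f, p, q = 5, 24, 0.05, 0.5, 0.2  # M and f are assumed values
    gain = rate_repeat(L, M, f, p) - rate_two_indicators(L, M, f, p, q)
    print(round(gain, 6), q * L - 1 + p)
```

With q = 1/10 the two rates are equal, which is the break-even point q = (1 - p)/L.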

VIII. IMPLEMENTATION AND RELATIVE PERFORMANCE OF REPLICATION COMPRESSION


Multispectral image compression was implemented as a computer program, in the
following manner:

1) The mean-square error bound is selected.

2) The stored element table size, 2^L - 1, is selected. At the beginning
of each image, the table contains all zero elements.

3) For each element to be transmitted, distance is measured to all the
elements stored in the table.

4a) If a stored element is within the MSE bound, the table designation of
the closest element is transmitted, using L bits.

4b) Alternately, if the table designation is the same as that last used, a
one bit word is transmitted. If the table designation is different, the correct
L bit table designation, prefixed by the complementary one bit word, is transmitted.

5) If no stored element is within the MSE bound, the reserved indicator word
of L bits (or L+1 bits per 4b) is transmitted, followed by M bits describing the
element, and L bits indicating the table position for its storage.

6) After one-quarter of the table elements have been replaced, and also at
the end of each line, the stored elements are ordered according to usage. In 5),
the least used will be replaced first. The usage counts are reduced by one-half,
after each end-of-line ordering, to reduce the effect of spatially distant elements.
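
The steps above can be sketched in a few lines. This is a simplified illustration, not the program described: it follows steps 1 to 4a and 5, omits the repeat indicator of 4b and the periodic sorting and halving of step 6, and assumes that "within the MSE bound" means the squared distance summed over bands does not exceed the bound. All function names are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance summed over spectral bands."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode(elements, L, mse_bound, bands=4):
    """Return a list of ("index", i) or ("direct", element, slot) codes."""
    size = 2 ** L - 1                       # one code word is reserved
    table = [(0,) * bands for _ in range(size)]
    uses = [0] * size
    codes = []
    for element in elements:
        distances = [squared_distance(element, stored) for stored in table]
        best = min(range(size), key=distances.__getitem__)
        if distances[best] <= mse_bound:
            uses[best] += 1
            codes.append(("index", best))
        else:
            slot = min(range(size), key=uses.__getitem__)  # least used
            table[slot] = tuple(element)
            uses[slot] = 1
            codes.append(("direct", tuple(element), slot))
    return codes


def decode(codes, L, bands=4):
    """Rebuild the elements from the codes, mirroring the encoder's table."""
    table = [(0,) * bands for _ in range(2 ** L - 1)]
    output = []
    for code in codes:
        if code[0] == "index":
            output.append(table[code[1]])
        else:
            _, element, slot = code
            table[slot] = element
            output.append(element)
    return output


def total_bits(codes, L, M):
    """L bits per table designation, 2L + M bits per direct transmission."""
    return sum(L if code[0] == "index" else 2 * L + M for code in codes)
```

Every decoded element is either an exact copy or a stored element within the bound, so the squared error of each element never exceeds the selected bound.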

The rate and MSE results for two images have been given in tables I and III.
In table I, the first stored element at distance less than the bound was used to
represent the compressed element. In table III, the closest stored element at
distance less than the bound was used. In both tables, the first number given
for transmission rate is for the method not using a special repeat designation (4a),
and the second rate number is for the method using the repeat designation (4b).

The rate and MSE data of table III is plotted in figures 3 and 4. The plotted rate
is in bits per element per spectral band. The mean-square error is given in units
per band, where an error of one corresponds to an error in the least significant
bit of the eight bit data. Previous results from several sources are shown for
comparison.

Figure 3: Rate versus distortion for the Salton Sea image.
(Axes: mean square error per band; rate, bits/pel/band.)

Figure 4: Rate versus distortion for the Bald Knob image.
(Legend: two replication compression curves.)

For the Salton Sea image, figure 3, the results of two dimensional cosine
and two dimensional Hadamard transform compression are shown. (19) Each band is
processed independently. The KL-Hadamard-DPCM compression uses a Karhunen-Loeve
spectral transform, followed by a horizontal Hadamard transform and vertical
differencing in each transformed band. These three methods use a fixed rate
for each element; the other methods shown all have varying rate. The adaptive
cosine transform compression is two dimensional spatial compression, using
different transform vector bit assignments and quantizations in regions of different
spatial detail. The adaptive cluster coding groups data elements into spectrally
similar clusters, merges clusters that have centroids within a predefined distance,
and entropy codes the class designations. (5) For the Salton Sea image, replication
compression and adaptive cosine compression give the best results.

For the Bald Knob image, figure 4, two dimensional Hadamard compression is the
only non-adaptive technique. Two adaptive three dimensional transform techniques
were used with similar results. In the adaptive Haar Hadamard method, a
Haar transform in the spectral domain was followed by an adaptive, variable rate
two dimensional Hadamard transform. (5) The three dimensional Hadamard transform
adapted to spectral differences, using a fixed rate. For the Bald Knob image,
adaptive cluster coding gave the best results, followed by replication compression
with a repeat indicator.

Figure 5 shows the originals of bands 1 and 4 of the Salton Sea image, and the
same compressed using replication compression. This compression test used L equal
to 5 and an MSE bound of 10. Using the closest stored element and a repeat indicator,
the transmission rate is 5.39 bits per element, or 1.35 bits per element per band.
The average MSE per band, in units of the least significant bit of the original eight
bit data, is 4.34.

Figure 6 shows original and compressed bands 1 and 4 of the Bald Knob image.
For this compression test, L equalled 5, and the MSE bound was 20. Using the closest
stored element and the repeat indicator, the rate is 5.89 bits per element, or 1.47
bits per element per band. The average MSE per band, in terms of the least
significant bit of the original data, is 1.79.

Figure 5: Salton Sea image, original and compressed bands 1 and 4, 1.35 bits
per element per band and 4.34 average MSE per band.
(Panels: original, band 4; compressed, band 4.)

Figure 6: Bald Knob image, original and compressed bands 1 and 4, 1.47
bits per element per band and 1.79 average MSE per band.
(Panels: original, band 4; compressed, band 4.)

The Salton Sea image has 240 elements and 248 horizontal lines. The Bald
Knob image has 176 elements and 2^ lines. The individual picture elements are
visible in the images, because each element was duplicated six times in both the
vertical and horizontal directions. The method of replication compression
preserves high contrast spatial detail, but introduces false contouring in low
contrast areas.

IX. REPLICATION COMPRESSION AND CLUSTER CODING


The above results show that selective replication, with a repeat indicator,
gives performance comparable to cluster coding. Both methods use spectral
compression and a table entry indication. The two techniques will be compared in
computation requirement, maximum error, subjective image appearance, and spatial
compression characteristics.

In replication compression, each input element requires one distance computation
for each stored table element. In cluster coding, the initial clusters are formed
by computing the distances between elements and joining the closest. The centroids
of the clusters are computed, and the distances between them are computed to determine
which clusters can be merged. To reduce the computation requirement, clusters are
formed over two dimensional subpictures, which were 16 by 16 for the results referred
to above. The 256 elements have about 128 distances each, which corresponds to the
distance computation for replication compression using a seven bit table. The typical
table size of five bits requires one-quarter as much computation. After the initial
distance computation, cluster coding requires many more computations, depending
on the image data.

In replication compression, the maximum error is defined by an input
parameter, and is typically three times the average error. In cluster coding, the
limited number of clusters may cause large error. Suppose we are restricted to
sixteen initial classes in four dimensions. The sixteen centroids with maximum
separation in four dimensions are the vertices of the four dimensional hypercube.
These are at (±A,±A,±A,±A), where A is the maximum amplitude in each dimension.
Suppose an element occurs at the center (0,0,0,0), and must be merged with a cluster
centered at one of the vertices. If there are many elements in that cluster, the
new point will change the centroid little, and the error in representing the center
point is 4A^2. This error is equal to the square of the largest possible element
value. This example is extreme, and most errors are small, but the maximum error
in cluster coding is data dependent, and may be very large.
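
The hypercube example can be verified by enumerating the sixteen vertices. The sketch below is only an arithmetic check; the function name and the amplitude A = 127 are illustrative assumptions.

```python
from itertools import product


def center_to_vertex_errors(A, dimensions=4):
    """Squared distance from the origin to each vertex (+-A, ..., +-A)."""
    vertices = product((-A, A), repeat=dimensions)
    return [sum(v * v for v in vertex) for vertex in vertices]


if __name__ == "__main__":
    errors = center_to_vertex_errors(127)
    print(len(errors), min(errors), max(errors))
```

All sixteen vertices are equally distant from the center, so whichever cluster absorbs the center point, the squared error is 4A^2.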

As cluster coding and replication compression are both spectral methods, the
subjective appearance of the compressed images is similar, and avoids the spatial
blurring introduced by predictive and transform compression at comparable rates.
However, because cluster coding uses two dimensional subpictures, the subpicture
edges are visible in images compressed at low rates.

Replication compression makes use of spatial correlation by placing current
elements in the stored table, and by indicating repeated replication elements by
a one bit word. Adaptive clustering makes use of spatial correlation by forming
local clusters, which are few (reducing rate) and spectrally similar (reducing
error) when spatial correlation is high. Because of the variable number of
clusters, cluster coding can achieve a low rate in areas of high correlation.
Because of the entropy coding of cluster indicators, the occurrence of a few unusual
elements does not significantly increase rate. Replication compression could be
similarly improved by making the list size adaptive and by entropy coding the list
indicators. The replication algorithm used here uses only past elements and the
current element, and the gain of adaptive list size and entropy coding would be
increased if the replication algorithm examined future elements before compression,
as does cluster coding.

X. CONCLUSION


A method has been developed for multispectral image compression, using
replication of stored elements with an error bound. The method has been theoretically
described, and experimental performance has been compared with other compression
techniques. The average mean-square error for two Landsat images is about one-third
of the error bound, as expected from theoretical considerations. For 2^L - 1 stored
elements, and for the L giving minimum rate, the transmission rate is about L + 1.5.
To reduce the mean-square error to one-fourth its value, L and the transmission
rate per element must increase about one bit per effective dimension. Performance
is improved slightly by having the stored elements include more frequently used and
more recently used elements. Having a one word indicator for a repeated use of
the same element gives significant rate reduction. The rate and mean-square error
performance of replication compression is superior to that of most previously
examined techniques, and is similar to that of adaptive cluster coding. The
implementation of replication compression is simple, and is similar to the random
coding, table look-up method shown to achieve the rate-distortion bound.

APPENDIX. EXPECTED MEAN-SQUARE ERROR WHEN THE CLOSEST STORED ELEMENT IS USED


Mean-square error can be found by integration over the appropriate n dimensional
volume. Initially, we consider the case where the distance bound is d, and there
are two elements at distance 2b < 2d, as shown in figure A.1. The expected MSE
contribution to the left hand sphere, due to the shaded cap, is

    E(MSE)_1 = \int_b^d \int_0^{(d^2 - z^2)^{1/2}} \int_0^{(d^2 - y^2 - z^2)^{1/2}}
               (1/V_n) (r^2 + y^2 + z^2) (C_{n-2} r^{n-3} dr dy dz)

The three terms in the integral are the probability density function of r, the
measure of squared error, and the volume element. Using the closest element
transfers the points in the shaded cap to the right hand element, and the expected
MSE is

    E(MSE)_2 = \int_b^d \int_0^{(d^2 - z^2)^{1/2}} \int_0^{(d^2 - y^2 - z^2)^{1/2}}
               (1/V_n) (r^2 + y^2 + (2b - z)^2) (C_{n-2} r^{n-3} dr dy dz)

    E(MSE)_{change} = \int_b^d \int_0^{(d^2 - z^2)^{1/2}} \int_0^{(d^2 - y^2 - z^2)^{1/2}}
               (1/V_n) 4b(b - z) (C_{n-2} r^{n-3} dr dy dz)

After integration using y = (d^2 - z^2)^{1/2} sin s, and z = d cos t,

    E(MSE)_{change} = \int_0^{\cos^{-1}(b/d)} (4 b^2 K_{n-1}/K_n) \sin^n t \, dt
                      - \frac{4 b d K_{n-1} (1 - b^2/d^2)^{(n+1)/2}}{K_n (n+1)}

                    = F(n,b,d)

For small n, this equation is readily integrated.

We have found the reduction in one element representing region, due to
one neighboring sphere at distance 2b. The expected MSE reduction for an element
representing region depends on the number of neighboring elements at different
distances. To estimate this number, we assume that the volume of neighboring spheres
exactly equals the local volume available, including the volume of the original
sphere. That is, the volume of the gaps between spheres is assumed equal to the
overlapping sphere volume. As shown in figure A.2, an element at distance 2b from
another element has more than one-half its volume within distance (4b^2 + d^2)^{1/2} of
that element. The number of spheres at distance 2b is then

    N(b) = \frac{K_n (4b^2 + d^2)^{n/2}}{K_n d^n} = (4b^2/d^2 + 1)^{n/2}

The expected MSE is then

    E(MSE) = \int_0^d F(n,b,d) N(b) \, db

The values obtained from this integration are given in table II, in the text, for
dimension n = 1, 2, 3, and 4.

INTRODUCTION

This report describes the current state of Landsat data classification by
computer, and considers possible areas of research for NASA-ARC. The fundamental
problem is obtaining useful discrimination of various phenomena, given the spectral,
spatial, and temporal characteristics of Landsat data. Improved accuracy sometimes
justifies the cost of more sophisticated computer processing, and different
classification methods provide alternate approaches for special problems.

LANDSAT IMAGE SYSTEMS

The classification of Landsat images is one of many operations performed by
a computer image processing system. JPL's VICAR, ESL's IDIMS, GE's Image 100, and
Purdue's LARS are well known. The functions usually performed by these systems are
as follows: 1-input (tape to disc, reformat, data check), 2-mosaic (merge image files),
3-clean up (remove bad lines, banding), 4-statistics (histograms, ratios, classifier
parameters), 5-enhancement (smoothing, edge emphasis), 6-display (false color, area
enlargement, contrast stretch), 7-clustering (classification centers),
8-classification (training samples, maximum likelihood, statistics), 9-mapping (cross
correlation, overlay, registration), 10-output (data tape, photo copy). The most
direct way to investigate classification methods using the SEL 32 would be to obtain
IDIMS or EDITOR format, cleaned up, 800 cpi data tapes, and program the SEL 32 to
display, cluster, and classify such input data. If useful classifiers are developed,
their results could be made compatible with the IDIMS or EDITOR output functions. The
alternate approach would be to develop a complete image system for the SEL 32, which
would require extensive time and effort.

FAMILIAR LANDSAT DATA CLASSIFIERS

The most widely used Landsat classifier assumes that each class has a normal
multivariate distribution, and uses the maximum likelihood decision rule. The training
samples for each class are described by a vector of four spectral values. The
training samples are used to compute the mean and covariance of each class, which
completely define the normal multivariate distribution. For each unknown sample, the
probability that the sample is a member of each class is computed from the distribution
and the a priori class probability, and the class with the highest probability is
selected. This classifier requires the computation of quadratic forms. Classification
of a full Landsat frame requires about 90 hours on the IDIMS HP-3000, 2 hours on an
array processor, and about 20 minutes on the ILLIAC IV. Several methods have been
used to increase the speed of the normal distribution, maximum likelihood classifier,
including table look up (11, 17 p370, 19 p755), sequential decision trees (19 p778),
linear approximation to the quadratic discriminant (19 p780), and canonical successive
approximations to the quadratic discriminant (7).

If all the classes have the same form of distribution (i.e., the same a priori
probability and covariance matrix) a maximum likelihood classifier uses a linear
rather than a quadratic discriminant (5 p29). If the four spectral values are also
independent and have equal variances, the correct maximum likelihood decision rule
is to assign a sample to the class having the closest mean or centroid vector (5 p27).
These classifiers are faster than the classifier for the general normal multivariate
distribution, but they are usually less accurate, because their assumptions are less
realistic. Because these classifiers assume the form of the class distribution, and
compute the distribution parameters from the training samples, they are termed
parametric classifiers.
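
The closest centroid rule mentioned above is the simplest of these parametric classifiers. A minimal sketch follows; the function names, class labels, and four band sample values are hypothetical and serve only to show the computation.

```python
def class_centroids(training):
    """Mean spectral vector per class; training maps label -> sample list."""
    centroids = {}
    for label, samples in training.items():
        count = len(samples)
        centroids[label] = tuple(sum(band) / count for band in zip(*samples))
    return centroids


def classify(sample, centroids):
    """Assign the sample to the class with the closest centroid."""
    def squared_distance(label):
        return sum((s - c) ** 2 for s, c in zip(sample, centroids[label]))

    return min(centroids, key=squared_distance)


if __name__ == "__main__":
    # Hypothetical four band training samples, for illustration only.
    training = {
        "water": [(10, 8, 5, 2), (12, 10, 7, 4)],
        "crop": [(30, 25, 60, 70), (34, 27, 64, 74)],
    }
    centroids = class_centroids(training)
    print(classify((13, 9, 8, 5), centroids))
```

Only one distance per class is computed for each sample, which is why this rule is so much faster than evaluating a quadratic form per class.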

Clustering is similar to classification, but uses a different approach. Maximum
likelihood classification is supervised, by using training samples. Clustering is
unsupervised, and examines only the data. Clustering methods define the inherent
spectral data groupings, using distance measures and group combining and splitting.
If the data clusters are assigned to classes, the process is similar to minimum
distance to centroid classification, which is less accurate than normal distribution
maximum likelihood classification.

Multimodal class distributions cause errors to be made by the normal distribution
maximum likelihood classifier. Clustering is usually used
to split the classes into spectrally homogeneous and separable subclasses, which
can be used to train the normal distribution maximum likelihood classifier. After
classification, the subclasses are recombined into the original classes.

NONPARAMETRIC CLASSIFIERS

Because the multispectral data does not fit the normal distribution model (19 p780),
nonparametric classifiers are sometimes used. A direct approach to nonparametric
classification is to use the training samples to estimate the class probability
distributions, and to perform maximum likelihood classification using these empirical
distributions. Density estimation requires more computation than normal maximum
likelihood classification, but some reductions are possible (19 p778). One
experiment found that estimated probability distributions gave no significant
improvement in classification accuracy (19 p780); another test found classification
accuracy improved from about 95% to 100% using four features (10, 3 pp156-162, 19 p784).

The nearest neighbor classifier is an important nonparametric method. All the
training samples are stored, the distance from an unclassified sample to each of the
training samples is computed, and the unclassified sample is assigned to the class
of its nearest spectral neighbor. Cover (4) has shown that, for a large number of
training samples, the error probability of nearest neighbor classification is at most
twice the error probability of maximum likelihood classification using the correct
probability distribution. The computation requirement for many training samples is
high, but the number of distance calculations can be reduced. The training samples
can be condensed to those near the boundaries of the class regions, which are
sufficient to define the class discrimination (12). The training samples can be
ordered by magnitude in one spectral band, so that training samples far from the
unknown sample in that band don't require a distance computation (6).
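
The band ordering idea can be illustrated with a short sketch. It is not the algorithm of reference (6); the function name, the choice of ordering band, and the data layout are assumptions made for illustration. The training samples are sorted by their value in one band, the search walks outward from the unknown sample's position in that ordering, and it stops as soon as the difference in that single band already exceeds the best distance found.

```python
import bisect
import math

def nearest_neighbor(train, labels, sample, band=0):
    """Return the label of the training vector nearest to `sample`.

    train  : list of spectral vectors (tuples of floats)
    labels : class label of each training vector
    band   : index of the band used to order the training samples (assumed)
    """
    # In practice the ordering would be computed once and reused.
    order = sorted(range(len(train)), key=lambda i: train[i][band])
    keys = [train[i][band] for i in order]

    # Start next to the sample's position in the ordering band and walk outward.
    start = bisect.bisect_left(keys, sample[band])
    lo, hi = start - 1, start
    best_dist, best_label = math.inf, None
    while lo >= 0 or hi < len(order):
        gap_lo = sample[band] - keys[lo] if lo >= 0 else math.inf
        gap_hi = keys[hi] - sample[band] if hi < len(order) else math.inf
        if min(gap_lo, gap_hi) >= best_dist:
            break  # every remaining sample is at least this far away
        if gap_lo <= gap_hi:
            idx, lo = order[lo], lo - 1
        else:
            idx, hi = order[hi], hi + 1
        dist = math.dist(train[idx], sample)  # full distance only when needed
        if dist < best_dist:
            best_dist, best_label = dist, labels[idx]
    return best_label

# Hypothetical two band training data.
train = [(10.0, 20.0), (12.0, 22.0), (50.0, 60.0), (52.0, 58.0)]
labels = ["water", "water", "crop", "crop"]
result = nearest_neighbor(train, labels, (11.0, 21.0))
```

The early exit is safe because the full Euclidean distance can never be smaller than the difference in any single band.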

If the parametric assumption is the source of classification error, nonparametric
methods will improve accuracy. However, when clustering is used to divide the classes
into homogeneous and separable subclasses for normal maximum likelihood classification,
the resulting class distributions can be very similar to the empirical distributions.

The interactive process is "frequently subjective" (19 p733) and determined by
"artistic license" (19 p732). This requires a knowledgeable and talented system
user. The use of nonparametric classifiers may not improve classification results,
but can shift effort from the user to the computer, and allow less experienced users
to obtain good results.

SPATIAL-SPECTRAL CLASSIFICATION

The classifiers described above operate independently on each sample, using
only spectral characteristics. Human image interpreters make use of context,
edges, texture, and grey scale, in that order of importance (17 pp380, 404).

Humans readily abstract spatial relations, but find it difficult to discriminate
using grey levels in many bands; the opposite is true of computer algorithms.

The identification of spatial patterns is the objective of character analysis,
of computer scene analysis, and of some areas of remote sensing, such as detection
of roads or geologic features. Texture analysis of multispectral scanner data has
been implemented by transforming spatial data to the frequency domain (17 p404), and
by deriving texture measures from simultaneous higher resolution data (Bryant JPL).

A simple method for using spatial context is to increase the probability estimates
for the classes occurring in spatial neighbors. Spatial-spectral clustering assigns
samples to the same class only if they are both spatially connected and spectrally
similar (19 p781, 17 p401). Pearson of ERL and others select samples for preliminary
clustering only if they are part of spectrally similar spatial blocks.

The best known method of spatial-spectral classification is per-field classification
(19 p785, 17 p371), in which samples in one agricultural field are all given
the same classification. The fields are defined by a boundary drawing algorithm or
by spatial-spectral clustering. The per-field technique gives improved classification
accuracy of crop type. Fu found 92.6% accuracy versus 79.7% accuracy for the normal
maximum likelihood classifier (9 p10). The classified images have much less random
classification noise than images produced ignoring spatial considerations, and
appear more like the product of human drafting. The total computation requirement
of per-field classification is reduced, because each field rather than each pixel
is classified (15).


ERROR ESTIMATION

The quality of a classifier is determined by its error performance. There are
two common methods of error estimation, resubstitution and cross-validation (20, 9).
In resubstitution, the original training samples are classified, and the resultant
error is taken as an estimate of the operational classifier error. A low
resubstitution error shows that no gross mistakes have been made in the classifier
design, but the resubstitution error is usually optimistic. This error estimate
does not indicate how well a classifier generalizes from the training samples. In
cross-validation methods, different subsets of the training samples are held out
of the classifier design, and the classifier is tested on these unused subsets.
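
The difference between the two estimates can be seen in a small sketch. It uses a minimum distance to centroid classifier and leave-one-out cross-validation as one concrete choice; the function names and the one band data are hypothetical, not taken from the cited references.

```python
import math

def centroids(samples, labels):
    """Mean vector of each class in the training set."""
    sums, counts = {}, {}
    for x, c in zip(samples, labels):
        acc = sums.setdefault(c, [0.0] * len(x))
        for k, v in enumerate(x):
            acc[k] += v
        counts[c] = counts.get(c, 0) + 1
    return {c: [v / counts[c] for v in acc] for c, acc in sums.items()}

def classify(model, x):
    """Assign x to the class with the nearest centroid."""
    return min(model, key=lambda c: math.dist(model[c], x))

def resubstitution_error(samples, labels):
    """Error measured on the same samples used to design the classifier."""
    model = centroids(samples, labels)
    wrong = sum(classify(model, x) != c for x, c in zip(samples, labels))
    return wrong / len(samples)

def leave_one_out_error(samples, labels):
    """Each sample is held out of the design, then used as a test sample."""
    wrong = 0
    for i in range(len(samples)):
        model = centroids(samples[:i] + samples[i + 1:],
                          labels[:i] + labels[i + 1:])
        wrong += classify(model, samples[i]) != labels[i]
    return wrong / len(samples)

# Hypothetical one band training data with an outlying class A sample.
samples = [(0.0,), (5.2,), (10.0,), (10.0,), (10.0,), (10.0,)]
labels = ["A", "A", "B", "B", "B", "B"]
resub = resubstitution_error(samples, labels)
loo = leave_one_out_error(samples, labels)
```

For this data the resubstitution error is zero, while holding out the sample at 5.2 moves the class A centroid away from it and the sample is misclassified, so the cross-validation estimate is one error in six.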

More training samples provide both better classifier quality, and a better 
estimate of quality. However, the classifier error can be estimated even when no 
ground truth is available (13). Training samples are identified by clustering,
spectral signature, or multidate imagery. The classifier error can be estimated 
from the distributions of the classified image samples. A simpler error estimate 
accepts the majority classification of each agricultural field as correct, and 
counts the pixels with other classification as errors. 
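
The simpler field majority estimate is easy to state in code. The sketch below assumes the classified pixels and their field membership are available as two parallel lists; the names and the sample data are hypothetical.

```python
from collections import Counter

def field_majority_error(pixel_classes, field_ids):
    """Estimate the error rate without ground truth.

    The majority class of each field is accepted as correct, and every
    pixel in that field with a different class is counted as an error.
    """
    by_field = {}
    for cls, fid in zip(pixel_classes, field_ids):
        by_field.setdefault(fid, []).append(cls)
    errors = 0
    for classes in by_field.values():
        majority_count = Counter(classes).most_common(1)[0][1]
        errors += len(classes) - majority_count
    return errors / len(pixel_classes)

# Hypothetical classified pixels from two fields of four pixels each.
pixel_classes = ["corn", "corn", "soy", "corn", "soy", "soy", "soy", "soy"]
field_ids = [1, 1, 1, 1, 2, 2, 2, 2]
estimate = field_majority_error(pixel_classes, field_ids)
```

The estimate is optimistic whenever a whole field is assigned to the wrong class, since those pixels are then counted as correct.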

LANDSAT 3

The recently launched Landsat 3 has a new band in the thermal far infrared 
region. Its resolution is 240 meters, rather than 60 meters as in the other bands. 

A thermal band has been found useful for vegetation classification, crop stress
(17 p337, Millard ARC), and urban land use classification during the winter months
(2). Classifiers can use the thermal band as a fifth dimension in computing clustering
and classification distance parameters. Since the thermal band is well separated
in spectrum from the other bands, and since it measures heat emission rather than
light reflectance, it provides increased separability of classes. Processing five
band data will increase computer time, and perhaps some attention should be given
to the feature selection problem: the selection, combination, and relative weighting
of the different bands (features) for optimum class separability.

The second sensor on Landsat is the return beam vidicon (RBV). In Landsat 1
and 2, the RBV consists of three television cameras imaging the same area as the
multispectral scanner, in three spectral bands. The RBV provides greater geographic
fidelity, but much less radiometric accuracy than the MSS. Because of tape recorder
problems, relatively little RBV data has been acquired. In Landsat 3, the RBV
consists of two television cameras, each providing a monochrome image of one-half the
MSS field of view. The RBV resolution has been increased to about 40 meters.

The RBV data should be the primary source for applications which emphasize
spatial information, and the MSS data should be used for spectral classification.
Combining data from both sources requires mutual registration. The combined data
could have the larger number of pixels of the RBV data, and six measured values per
pixel: the five MSS bands and the RBV panchromatic value. Another approach
would be to use the RBV data as sub-pixel texture information added to each MSS
pixel. The combined data could be useful for spatial-spectral classification.

CONCLUSION

The normal distribution maximum likelihood classifier is standard in Landsat
data analysis, but nonparametric or spatial-spectral classifiers sometimes provide
useful improvements in accuracy. Because classifiers require large amounts of
computer time, faster implementations are important. Extensive research in these
areas is reported in the pattern recognition and image processing literature, but
the techniques are not available in commercial software systems. It would be
useful to review the methods described, and to evaluate the more promising
experimentally.

There are three components in an interactive image system: computer processing,
human skill, and the man-machine interface. An advance in any component would be
very valuable, but wide scale, low cost usage of Landsat data depends on transferring
more of the burden to computer processing. It may be possible to substitute
nonparametric classification for the interactive process of selecting and clustering
training samples for the normal distribution maximum likelihood classifier.
Spatial-spectral classification can produce less noisy classification maps, or detect
spatial features, without human intervention. These techniques could increase the
effectiveness and satisfaction of image system users.

Important areas for future work at NASA-ARC are:

1. development of a nonparametric classifier to reduce human interaction, 

2. development of spatial algorithms to improve classification accuracy, 

3. development of classifiers using the new bands of Landsat 3. 

I. INTRODUCTION


The filling of the allocated spectrum and the use of higher power satellites
have led to consideration of bandwidth efficient modulation techniques for
satellite communications. Phase shift keying (PSK) has been replaced by
quadrature phase shift keying (QPSK) in some applications, and multiple phase
shift keying (MPSK) and two and four dimensional amplitude-phase shift keying
(APSK) are being investigated. Although MPSK and APSK have not been
implemented for satellite communication, they have been used in telephone modems
for several years. This paper examines the theoretical basis of higher dimensional
modulation, and describes two new classes of four dimensional designs that
are more efficient than familiar APSK modulation techniques.

Shannon's well-known capacity bound defines the trade-off in bandwidth
efficient modulation, and limits the potential gain of higher dimensional signal
design. Shannon proved that the largest possible number of error free messages,
M, that can be transmitted over a communications channel is

     M = ((P + N)/N)^(TW)                                        (1)


P is the signal power, T is the signal duration, W is the signal bandwidth, and
N is the noise power in the signal bandwidth, W. M messages can be transmitted
without error if the product TW is allowed to become infinitely large, and the
actual probability of error decreases exponentially as TW increases (reference 7).
If the channel capacity, C, is defined as the maximum rate, in bits per second, at
which error free information can be transmitted over a channel, then

     C = (1/T) log2 M = W log2 ((P + N)/N)                       (2)


The channel capacity can be doubled either by doubling the bandwidth, W, or
by increasing the signal power, P, so that the signal-to-noise ratio (P+N)/N
is squared.
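
A short numerical sketch of equation (2) shows how unequal the two routes to doubled capacity are. The bandwidth and power values are hypothetical, and the ratio (P+N)/N is held fixed when the bandwidth is doubled, which is an assumption of the sketch.

```python
import math

def capacity_bps(bandwidth_hz, ratio):
    """Equation (2): C = W log2((P + N)/N), with ratio = (P + N)/N."""
    return bandwidth_hz * math.log2(ratio)

W = 3300.0            # hypothetical bandwidth in Hz
P, N = 1000.0, 1.0    # hypothetical signal and noise power
ratio = (P + N) / N

c_base = capacity_bps(W, ratio)
c_wide = capacity_bps(2.0 * W, ratio)    # bandwidth doubled, ratio held fixed
c_power = capacity_bps(W, ratio ** 2)    # ratio squared, bandwidth fixed

# Signal power needed to square the ratio: (P2 + N)/N = ratio**2.
p_squared = N * (ratio ** 2 - 1.0)
```

With these values, squaring the ratio requires roughly a thousand times more signal power, which is why bandwidth efficient modulation is only attractive when signal power is plentiful.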

The dimensionality, D, of the message set is the number of basis functions
required to represent the message set. The two best known kinds of basis
functions are the Fourier series sinusoids in the frequency domain, and the sampling
(sin x)/x pulses in the time domain. The number of basis functions required
to represent a message set, the dimensionality, is equal to about 2TW
(references 6, 8). A continuous three minute telephone message has 10^6 dimensions,
and a one hour television transmission has 10^10 dimensions. Using large signal set
dimension requires long delays and high information storage. Because of its simple
implementation, the most common digital communications signal is the PSK pulse,
with duration, T, approximately equal to 1/2W, so that its dimensionality is one.
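
The orders of magnitude follow directly from D = 2TW. In the sketch below the telephone bandwidth of 3,300 Hz corresponds to the value 2W = 6,600 used for the voice grade channel in Section II; the 4.2 MHz television bandwidth is an assumption introduced only for illustration.

```python
def dimensionality(duration_s, bandwidth_hz):
    """Approximate number of basis functions, D = 2TW."""
    return 2.0 * duration_s * bandwidth_hz

# Three minute telephone message over a 3,300 Hz channel.
telephone = dimensionality(3 * 60, 3300.0)

# One hour television transmission; 4.2 MHz is an assumed video bandwidth.
television = dimensionality(60 * 60, 4.2e6)

# A single pulse of duration T = 1/2W has one dimension.
single_pulse = dimensionality(1.0 / (2.0 * 3300.0), 3300.0)
```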

The purpose of bandwidth efficient modulation is to increase the transmission
rate without increasing channel bandwidth, by using more signal power. To
emphasize power, we express the Shannon bound as capacity per dimension, C_D
(reference 9). The noise power spectral density is N_0, where N_0 = N/W, and the
signal energy is E = PT.

     C_D = (1/2TW) log2 M = C/2W

         = (1/2) log2 ((P + N)/N) = (1/2) log2 (1 + E / (N_0 TW))

         = (1/2) log2 (1 + 2 E_D / N_0)                          (3)

The energy per dimension is E_D = E/D = E/2TW.
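
Equation (3) can be inverted to give the energy per dimension needed for a chosen rate, which makes the cost of each additional bit explicit. The function names below are illustrative.

```python
import math

def capacity_per_dimension(ed_over_n0):
    """Equation (3): C_D = (1/2) log2(1 + 2 E_D / N_0)."""
    return 0.5 * math.log2(1.0 + 2.0 * ed_over_n0)

def ed_over_n0_for_rate(bits_per_dimension):
    """Inverse of equation (3): E_D / N_0 = (2**(2 C_D) - 1) / 2."""
    return (2.0 ** (2.0 * bits_per_dimension) - 1.0) / 2.0

# Energy to noise density ratios required for 1, 2, and 3 bits per dimension.
required = [ed_over_n0_for_rate(bits) for bits in (1, 2, 3)]
```

The required ratios are 1.5, 7.5, and 31.5, so at high rates each additional bit per dimension costs about a factor of four, or 6 dB, in signal energy.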

II. CONVENTIONAL MODULATION DESIGNS

Commercial telephone modems have transmission rates from a few baud to
1,200, 2,400, or 4,800 bits per second (bps). For simplicity, the modems
are designed in few dimensions. The baud rate of a transmission system is the
number of time division pulses per second, and is usually considerably less than
twice the highest useable baseband frequency. For the voice grade telephone
channel, the baud rate is 2,400, while 2W is 6,600. Modems use signals designed
for a single baud, containing only one or two dimensions.

The types of modem signal designs discussed by Davey are described in
Table I. Two one dimensional designs, PSK and four level (L=4) amplitude
modulation (AM), are added for comparison. The lowest rate system is binary
frequency shift keying (FSK), which has a rate of one-half bit per dimension.
One of two orthogonal frequencies is transmitted in each baud. Since the
frequencies are not transmitted independently, the design is two dimensional.
The receiver detects each frequency independently, then selects the most likely.
Table I shows the signal geometry for each modulation method, indicates the
design dimensionality, and gives the transmission rate per baud and per dimension.
At 2,400 bps, four phase QPSK is standard. Since only one of the four quadrature
signals is transmitted, the dimensions are not independent, and the design is
two dimensional. The receiver detects each phase independently, and uses both
results to decide which phase was transmitted.

In addition to MPSK, there are several types of modulation used at 4,800 bps
or proposed for higher rates. The two level (L=2) APSK (sometimes implemented as
vestigial side band) is a one dimensional system, although it is geometrically
equivalent to QPSK except for a 45 degree rotation. Instead of one of two
possible orthogonal signals being transmitted, as in QPSK, both of the orthogonal
signals are transmitted, and each is received and decoded independently. All of
the multi-level APSK designs are one dimensional, as the encoding and decoding
of each quadrature phase is independent. APSK designs are used because their
one dimensional structure permits easy implementation of high speed receivers.

The eight phase MPSK systems are two dimensional. It is evident from the signal
geometry that the higher rate designs require more signal power for equal error
probability.

Table I. Modem Signal Designs

Design             Bits per     Design        Bits per
Name               Baud         Dimension     Dimension

PSK                1            1             1
L=4 AM             2            1             2
Binary FSK         1            2             1/2
QPSK               2            2             1
L=2 APSK           2            1             1
L=3 APSK           3.16         1             1.58
MPSK, 8 phase      3            2             1.5
L=2, 4 phase       3            1             1.5
L=4 APSK           4            1             2
L=2, 8 phase       4            2             2
L=6 APSK           5.18         1             2.59
L=8 APSK           6            1             3

(The signal geometry diagrams of the Signal Design column are not reproduced.)
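
The rate columns for the multi-level APSK and MPSK rows can be checked with a few lines. The sketch assumes that an L level APSK design carries log2 L bits on each of the two quadrature components of a baud, and that an M phase design carries log2 M bits per baud over two dimensions; the function names are illustrative.

```python
import math

def apsk_bits(levels):
    """Return (bits per baud, bits per dimension) for L level APSK."""
    per_dimension = math.log2(levels)
    return 2.0 * per_dimension, per_dimension

def mpsk_bits(phases):
    """Return (bits per baud, bits per dimension) for M phase keying."""
    per_baud = math.log2(phases)
    return per_baud, per_baud / 2.0

apsk_rows = {levels: apsk_bits(levels) for levels in (2, 3, 4, 6, 8)}
qpsk_row = mpsk_bits(4)
eight_phase_row = mpsk_bits(8)
```

The computed values agree with the table to within rounding in the last digit.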

III. THE GEOMETRIC PROOF OF THE CAPACITY THEOREM


The capacity in bits per dimension is bounded by equation (3). The
geometric derivation of this equation gives insight into the efficiency of
modulation signal designs. The derivation assumes that there are M signals,
each with energy less than or equal to E = D E_D. The transmission channel adds
gaussian noise, with noise energy per dimension equal to N_0/2
(D N_0/2 = 2TW N_0/2 = NT). When D becomes large, the average noise power becomes
very close to N, and the noise perturbs the signal to some point near the surface
of a sphere of radius (D N_0/2)^(1/2), centered on the original signal. For low
error probability, the decision region of each signal must include the noise sphere
surrounding the signal. The number of signals is bounded because the signals are
within a sphere of radius (D E_D)^(1/2), as the signal energy is bounded. Since the
volume of a D dimensional sphere of radius r is B_D r^D, and since
B_D (r-e)^D = B_D r^D (1 - e/r)^D is much less than B_D r^D for D large, nearly all
the volume of the sphere of allowed signals, and nearly all the signals, are near
the surface of the sphere of radius (D E_D)^(1/2). Although the length of the noise
perturbation is very nearly equal to (D N_0/2)^(1/2), the noise distribution along
the direction of the signal is gaussian, of zero expected value, and the noise
perturbation is nearly orthogonal to the signal. The signal vector and an orthogonal
noise vector are shown in Figure 1.

The figure allows the computation of the number of messages that can be distinguished
after transmission. The volume that must be allowed for each signal is
B_D (D N_0/2)^(D/2), the volume of the noise sphere. The total volume available for
the noise spheres is B_D (D N_0/2 + D E_D)^(D/2). The bound on M is

     M <= B_D (D N_0/2 + D E_D)^(D/2) / (B_D (D N_0/2)^(D/2))

        = (1 + 2 E_D / N_0)^(D/2)

     C_D = (1/D) log2 M = (1/2) log2 (1 + 2 E_D / N_0)           (4)

This is identical to equation (3).
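
The claim that nearly all of the volume lies near the surface is easy to check numerically. The sketch below evaluates the factor (1 - e/r)^D, the fraction of the sphere volume lying more than a distance e inside the surface; the one percent shell thickness is an arbitrary illustrative choice.

```python
def inner_volume_fraction(shell_fraction, dims):
    """Fraction of a D dimensional sphere inside radius r (1 - e/r).

    shell_fraction is e/r, the relative thickness of the outer shell.
    """
    return (1.0 - shell_fraction) ** dims

# Volume lying more than one percent of the radius inside the surface.
fractions = {dims: inner_volume_fraction(0.01, dims)
             for dims in (2, 4, 100, 1000)}
```

In two dimensions about 98 percent of the area lies inside the one percent shell, but for D = 1000 less than one part in ten thousand of the volume does.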

IV. THE PERFORMANCE OF ONE DIMENSIONAL APSK


The above geometric derivation of the capacity theorem, with the converse
proof that the bound can be approached for large D, implies that in efficient
signal designs, each signal decision region includes the noise sphere without
much additional volume, and that all the decision regions fill the region of
allowed signals. This result for large dimension is meaningful for one dimensional
APSK, because APSK is actually used in a large number of dimensions. Consider
the simplest case of two level APSK or PSK. During each pulse period, a value
of ±1 is transmitted. As the transmission is repeated, in successive pulse
periods or dimensions, the possible signal points form a one dimensional array,
then a two dimensional square, a three dimensional cube, and so on. After D
pulse periods, the transmitted signal is a D dimensional vector (±1,±1,±1, ...±1),
which is one of the 2^D vertices of a D dimensional hypercube. Although it was
encoded, transmitted, and decoded as a sequence of one dimensional signals, the
same vector could be the result of a D dimensional design, for example a length
D block code. Block coding reduces the number of possible signals below 2^D by
using parity check bits, and so improves error performance by reducing the rate
per dimension. In contrast to bandwidth efficient modulation, lower signal to
noise is used while bandwidth is increased.

How efficient is one dimensional APSK as D becomes large? We first observe,
after Wozencraft and Jacobs (reference 12), that APSK leaves empty some of the
volume allowed for the received signal plus noise. In Figure 2, we see that there is
unfilled area near ±(D E_D)^(1/2) on the orthogonal axes. We also observe that the
decision regions for each signal are cubic, rather than the optimum spherical shape.
We compute the number of possible messages for APSK, and compare it with the capacity
bound for large D.

The maximum signal amplitude in D dimensions is (D E_D)^(1/2). This is the
length from the center of the APSK cube to the most distant signal. If each
edge of a D dimensional cube has length 1, the cube diagonal has length D^(1/2).
The length of the diagonal is twice the length of the largest signal,
2 (D E_D)^(1/2), so that each edge of the APSK cube has length
2 (D E_D)^(1/2) / D^(1/2) = 2 E_D^(1/2), and extends from -E_D^(1/2) to +E_D^(1/2)
in each dimension. The transmitted signals are all in or on a cube of
side 2 E_D^(1/2). As D becomes large, the decision region for each signal must
include the noise sphere of radius (D N_0/2)^(1/2), as in the capacity theorem
proof. The decision regions for APSK are D dimensional hypercubes. To contain
a sphere of radius (D N_0/2)^(1/2), they must have an edge of 2 (D N_0/2)^(1/2). The
received signal decision regions are then all contained in a hypercube of edge
2 (D N_0/2)^(1/2) + 2 E_D^(1/2).

Figure 2. Two dimensional section of APSK.

The number of APSK signals is exactly equal to the volume of the hypercube
containing all the decision regions divided by the volume of one hypercube
decision region.


     M = (2 (D N_0/2)^(1/2) + 2 E_D^(1/2))^D / (2 (D N_0/2)^(1/2))^D

       = (1 + (2 E_D / D N_0)^(1/2))^D

The rate per dimension, R_D,APSK, is

     R_D,APSK = (1/D) log2 M = log2 (1 + (2 E_D / D N_0)^(1/2))  (5)

For simplicity in comparison, we assume that E_D is much greater than D N_0/2,
which implies a high rate per dimension. Under this assumption, the following
equations are approximately correct.

     R_D,APSK = (1/2) log2 (2 E_D / D N_0)

     C_D = (1/2) log2 (2 E_D / N_0)

     C_D - R_D,APSK = (1/2) log2 D                               (6)

Each time D is multiplied by 4, the achievable rate drops one bit per dimension
below capacity. For D = 10^6, approximately 4^10, which is the dimensionality
of a three minute telephone message, the rate of APSK is reduced 10 bits per
dimension below capacity. Although this result holds only for large dimension
and large signal to noise, it is possible that four dimensional designs might
gain up to one bit per dimension over one dimensional APSK.
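
The high signal to noise approximation of equation (6) can be compared with the exact expressions of equations (3) and (5). The sketch below is a numerical check only; the energy to noise values are hypothetical and chosen so that E_D is much greater than D N_0/2.

```python
import math

def capacity_per_dimension(ed_over_n0):
    """Equation (3): C_D = (1/2) log2(1 + 2 E_D / N_0)."""
    return 0.5 * math.log2(1.0 + 2.0 * ed_over_n0)

def apsk_rate_per_dimension(ed_over_n0, dims):
    """Equation (5): R = log2(1 + (2 E_D / (D N_0))**(1/2))."""
    return math.log2(1.0 + math.sqrt(2.0 * ed_over_n0 / dims))

def rate_gap(ed_over_n0, dims):
    """Exact difference between capacity and the APSK rate."""
    return capacity_per_dimension(ed_over_n0) - apsk_rate_per_dimension(ed_over_n0, dims)

gap_4 = rate_gap(1e6, 4)      # equation (6) predicts (1/2) log2 4 = 1 bit
gap_16 = rate_gap(1e8, 16)    # equation (6) predicts (1/2) log2 16 = 2 bits
```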

One dimensional APSK is a poor signal design for large dimension. This
is because a cube is not very similar to a sphere in higher dimensions. The two
causes of the inefficiency of APSK in large dimension are that the cube of signal
points occupies only a small portion of the circumscribed sphere of allowed
signals, and that the noise spheres required for low error occupy a small portion
of the circumscribed cubic decision regions. Good signal designs require a
high density packing of spherical decision regions that extends throughout the
sphere of allowed signals.

V. TWO DIMENSIONAL SIGNAL DESIGN

APSK is the best one dimensional signal design, since the sphere of allowed
signals and the spherical decision regions reduce to line segments in one dimension.
The optimum design spaces signals equally along a line. Two dimensional
signal designs offer some scope for improvement. They have been studied by
Foschini, Gitlin, and Weinstein (reference 13). It is well known that the densest
packing of non-intersecting spheres (the decision regions) in two dimensional space
is defined by placing the sphere centers on points of the equilateral triangle or
regular hexagon lattice, shown in Figure 3. For large M, the optimum signal
design is a circular region of the equilateral triangle lattice. For small M,
the locally optimum signal designs are circular regions containing slightly
irregular approximations to the equilateral triangle lattice. Foschini and his
coauthors found that the best designs for 8 and 16 signals are only 1.0 dB and
0.5 dB better than one dimensional APSK.

The potential of two dimensional design can be easily bounded. The gain in useable
signal volume from using all the region with less than the maximum signal energy
is the ratio of the area of a circumscribed circle to the area of an inscribed
square. If r is the radius of the circle, the ratio is

     ratio, full area = π r^2 / 2 r^2 = π/2

The gain in number of signals per unit volume from the use of the equilateral
triangle lattice is the ratio of the APSK decision volume to the equilateral
lattice decision volume. In a lattice structure, there is one signal for each
lattice cell, and each signal has a decision volume equal to the volume of one
lattice cell. While the decision regions in a square lattice are squares, the
decision regions in an equilateral triangle lattice are hexagons, as shown in
Figure 3, and the lattice cells are parallelograms. The area of a square cell of
edge 2 is 2^2 or 4. The area of the parallelogram cell of edge 2 in the equilateral
triangle lattice is 2 x 3^(1/2). The ratio of gain in signals per unit volume is

     ratio, triangle lattice = 4 / (2 x 3^(1/2)) = 2 / 3^(1/2)

Considering both the increased area for signals, and the reduced area required
for each signal, the number of signals in one dimensional APSK can be increased
by the product of the above ratios, for two dimensional design.

     ratio = (π/2) (2 / 3^(1/2)) = π / 3^(1/2)

The maximum two dimensional design rate per dimension is

     R_MAX = (1/2) log2 (M_APSK x ratio)

           = R_D,APSK + (1/2) log2 (π / 3^(1/2))

If the number of signals is large, as for large signal to noise, so that little
area is wasted, two dimensional design can approach a rate (1/2) log2 (π / 3^(1/2)),
or 0.43 bits per dimension, higher than APSK.

Suppose that, instead of increasing the number of signals transmitted with
a given energy, it is desired to keep the same number of signals and reduce the
maximum signal energy. Since the number of signals in a fixed area is increased
by π / 3^(1/2), a fixed number of signals can be contained in an area reduced by
3^(1/2) / π. Since the area is π r^2 = π (E^(1/2))^2 = π E, the energy E can be
reduced by 3^(1/2) / π, or 2.60 dB. The APSK and equilateral lattice signal
separation are the same, but the number of neighboring signals is increased from 4 to
6, and the probability of error probably increases 50 percent.
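
The two dimensional bounds can be reproduced with a few lines of arithmetic. The sketch follows the geometry described above: a circle circumscribed about the APSK square, and lattice cells of edge 2. The variable names are illustrative.

```python
import math

r = 1.0                                        # circle radius, arbitrary units
area_ratio = (math.pi * r ** 2) / (2.0 * r ** 2)    # circle over inscribed square

square_cell = 2.0 ** 2                         # square lattice cell of edge 2
triangle_cell = 2.0 * 2.0 * math.sin(math.radians(60.0))   # 60 degree parallelogram
lattice_ratio = square_cell / triangle_cell    # 2 / 3**(1/2)

total_ratio = area_ratio * lattice_ratio       # pi / 3**(1/2)

# Two dimensions share the extra signals, so the rate gain is halved.
rate_gain_per_dimension = 0.5 * math.log2(total_ratio)

# Area is proportional to energy in two dimensions.
energy_saving_db = 10.0 * math.log10(total_ratio)
```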

VI. FOUR DIMENSIONAL SIGNAL DESIGN


We will consider signal designs in four dimensions, which are expected
to provide more improvement than two dimensional design. Four dimensional
designs have been suggested for satellite communications by Welti and Lee
(reference 14) and Welti. The four orthogonal signals defining the four dimensions
can be two successive pulses, each with two phase-quadrature signals, but any four
orthogonal signals give equivalent performance. Comparison of APSK to the
capacity bound, equation (6) above, indicated that it may be possible to increase
the transmission rate by one bit per dimension using four dimensional design,
without increasing signal energy and while maintaining the same minimum signal
separation and approximately the same error probability.

We first define the gain achieved by filling the entire allowed region
with signals, rather than using only the hypercube APSK region. Suppose that
a four dimensional cube has edge equal to X. The cube diagonal is
(X^2 + X^2 + X^2 + X^2)^(1/2) = 2X, and since this is equal to 2 E^(1/2), the
maximum signal amplitude is also X. The hypercube volume is X^4, and the volume of
the circumscribed hypersphere is (1/2) π^2 X^4, after Sommerville (reference 16).
The ratio of the allowed signal volume to the volume used by APSK is

     (1/2) π^2 X^4 / X^4 = π^2 / 2

Assuming that the number of signals increases proportionately, the increased
rate per dimension is (1/4) log2 (π^2 / 2) = 0.58 bits per dimension. The volume
of the allowed signal sphere is (1/2) π^2 (E^(1/2))^4 = (1/2) π^2 E^2. If the same
number of signals is used, the volume can be reduced by 2 / π^2, and the
signal energy can be reduced by 2^(1/2) / π, or 3.46 dB.
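
The four dimensional figures can be checked against the general formula for the volume of a D dimensional sphere, V = π^(D/2) r^D / Γ(D/2 + 1), which is a standard result introduced here for the check and not taken from the text. The same function reproduces the two dimensional ratio π/2.

```python
import math

def ball_volume(dims, radius):
    """Volume of a sphere of the given radius in `dims` dimensions."""
    return math.pi ** (dims / 2.0) / math.gamma(dims / 2.0 + 1.0) * radius ** dims

def sphere_to_cube_ratio(dims):
    """Circumscribed sphere volume over the volume of the inscribed cube."""
    radius = 1.0
    edge = 2.0 * radius / math.sqrt(dims)   # cube diagonal equals sphere diameter
    return ball_volume(dims, radius) / edge ** dims

ratio_4d = sphere_to_cube_ratio(4)                     # pi**2 / 2
rate_gain_4d = math.log2(ratio_4d) / 4.0               # bits per dimension

# Volume is proportional to E**2 in four dimensions, so the energy ratio is
# the square root of the volume ratio.
energy_saving_db = 10.0 * math.log10(math.sqrt(ratio_4d))
```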

This gain can be obtained by extending the four dimensional hypercube lattice
of APSK throughout the hypersphere defined by the maximum signal energy. If the
signal separation is 2, the radius of the sphere included in the decision region
is 1. Then the APSK design can have a signal placed at the point (0,0,0,0), and
additional signals placed at the points defined by the four orthogonal
displacement vectors

     (2,0,0,0)
     (0,2,0,0)
     (0,0,2,0)
     (0,0,0,2)

The minimum signal separation is 2, and additional signal points can be added
by adding or subtracting these four displacement vectors from any previous
signal point. All lattice points (±2a,±2b,±2c,±2d) such that
2 (a^2 + b^2 + c^2 + d^2)^(1/2) is less than or equal to E^(1/2) do not exceed the
allowed signal energy.

Suppose that a point at (1,1,1,1) is added to the four dimensional hypercube
lattice. This point, like the original members of the hypercube lattice, has a
distance of 2 from the nearest hypercube lattice points, and therefore has a decision
region which includes a sphere of radius 1. (Distance is computed by the Euclidean
formula, d = (w^2 + x^2 + y^2 + z^2)^(1/2), where w, x, y, and z are distances on
orthogonal axes.) For each point in the original hypercube lattice, a new point can be
added at displacement (1,1,1,1). The number of messages in the allowed signal
hypersphere is doubled, so the transmission rate is increased
by one bit in four dimensions, or 0.25 bits per dimension.

The volume of each decision region in the original hypercube lattice is
2^4, but the decision volume is reduced to 2^3, and is no longer cubic, when the
number of signals is doubled. Since the volume per decision region is one-half
the original volume, the original number of signals can be accommodated in one-half
the original volume, in the new lattice. The hypersphere volume is (1/2) π^2 E^2,
so that E can be reduced by 1/2^(1/2), or 1.51 dB.

The new denser lattice can be described as a body centered cubic lattice, to
indicate the signal added at the cube center. The cube center signal has sixteen
neighbors at the vertices of the original hypercube, defined by (0,0,0,0), (2,0,0,0),
(0,2,0,0), ... , (2,2,2,2), and eight neighbors at the centers of eight neighboring
hypercubes, located at (1,1,1,1) displaced by (±2,0,0,0), (0,±2,0,0), (0,0,±2,0),
and (0,0,0,±2). Each hypercube vertex signal similarly has twenty-four
neighbors. In addition to the eight original lattice signals at (±2,0,0,0),
(0,±2,0,0), (0,0,±2,0), and (0,0,0,±2), it has new neighbors at the centers
of the sixteen hypercubes that meet at (0,0,0,0). These signals are at the
vertices of a hypercube (±1,±1,±1,±1).
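
The neighbor counts can be confirmed by enumerating a finite block of the lattice. In the sketch below the original hypercube lattice points are those with all coordinates even, and the added body centered points are those with all coordinates odd; the block size and the function names are choices made for the illustration.

```python
import itertools
import math

def body_centered_points(extent):
    """Edge 2 hypercube lattice plus its copy displaced by (1,1,1,1)."""
    points = set()
    for p in itertools.product(range(-extent, extent + 1), repeat=4):
        all_even = all(c % 2 == 0 for c in p)   # original lattice points
        all_odd = all(c % 2 == 1 for c in p)    # added cube center points
        if all_even or all_odd:
            points.add(p)
    return points

def neighbors(point, points, separation=2.0):
    """Lattice points at exactly the minimum signal separation."""
    return [q for q in points
            if q != point and abs(math.dist(point, q) - separation) < 1e-9]

POINTS = body_centered_points(3)
vertex_neighbors = neighbors((0, 0, 0, 0), POINTS)
center_neighbors = neighbors((1, 1, 1, 1), POINTS)
min_separation = min(math.dist(p, q)
                     for p, q in itertools.combinations(POINTS, 2))
```

Both kinds of signal have twenty-four neighbors, and doubling the number of signals leaves the minimum separation at 2.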

An identical lattice can also be constructed as an alternate vertex cubic
lattice. This construction will also be given, since it uses the familiar idea
of a parity check, and provides a simpler description of signal designs and the
required receiver. Consider the cubic lattice defined by the zero point (0,0,0,0)
and the four orthogonal vectors

     (±2^(1/2),0,0,0)
     (0,±2^(1/2),0,0)
     (0,0,±2^(1/2),0)
     (0,0,0,±2^(1/2))

If these defining vectors are designated as X_1, X_2, X_3, and X_4, the cubic
lattice points are all at the tips of vectors of the form
a_1 X_1 + a_2 X_2 + a_3 X_3 + a_4 X_4, where a_1, a_2, a_3, and a_4 are integers.
If a parity check is placed on the sum of the vector coordinates, so that
a_1 + a_2 + a_3 + a_4 must be even or odd, the vector tips are points on an
alternate vertex cubic lattice. The signal separation of the original cubic lattice
is 2^(1/2). This separation is increased to 2 by the parity requirement, which
insures that if two signals differ by 2^(1/2) in one dimension, they differ by
2^(1/2) in two orthogonal dimensions. The alternate vertex cubic lattice has the
same signal separation as the body centered cubic lattice. The volume of the
original cubic decision region is (2^(1/2))^4 = 4. This
volume is doubled, and changed in shape, when only the even or odd parity vertices
are used. The alternate vertex cubic lattice has the same decision volume as the
body centered cubic lattice.
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It cau be sbotm that each signal in the alternate vertex cubic lattice has 
2h nei^bors, as in the body centered cubic lattice. For the alternate vertex 
cubic lattice, the number of nei^bors is the number of ways to add two ortho- 

1/2 

gonal vectors of value ±2 ' to the given signal. There are six ways to choose 
2 out of four dimensions, and U ways to assign two signs, so each signal has 2h 
neJ . ■lors . If the four dimensions are w,x,y, and 2, the 6 pairs are wx, wy, wz, 
xy, xz, and yz. Each pair dsfines a two dimensional plane section of the alternate 
vertex cubic lattice, as shown in figure U. A signal has U neighbors in each plane. 
In both constructions, each nei^boring signal has 4 nei^bors which are at dist- 
ance 2 from itself, and also at distance 2 form the original signal. 
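
The separation and neighbor count stated above can be checked by direct enumeration. The following is a minimal sketch, not part of the original study: it assumes the even parity lattice, a search box of two lattice steps around the zero point, and function names chosen here for illustration.

```python
import itertools
import math

SQRT2 = math.sqrt(2.0)  # spacing of the original cubic lattice


def alternate_vertex_points(radius=2):
    """Integer coefficient vectors (a1, a2, a3, a4) with an even sum.

    Each coefficient is limited to |a_i| <= radius; the lattice point itself
    is 2^(1/2) times the coefficient vector.
    """
    rng = range(-radius, radius + 1)
    return [a for a in itertools.product(rng, repeat=4) if sum(a) % 2 == 0]


def distance(a, b):
    """Euclidean distance between the lattice points with coefficients a and b."""
    return SQRT2 * math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def neighbor_summary(radius=2):
    """Return (minimum separation, number of points at that separation)."""
    origin = (0, 0, 0, 0)
    others = [p for p in alternate_vertex_points(radius) if p != origin]
    d_min = min(distance(origin, p) for p in others)
    count = sum(1 for p in others if math.isclose(distance(origin, p), d_min))
    return d_min, count


if __name__ == "__main__":
    d_min, count = neighbor_summary()
    print(f"minimum separation = {d_min:.3f}, neighbors = {count}")
```

Because the lattice looks the same from every point, checking the zero point is enough to characterize every signal away from the edge of the signal region.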

Although the body centered cubic lattice and the alternate vertex cubic lattice are identical structures, the original defining lattices differ in scale and rotation. The original lattice for the body centered cubic construction has one-half the final number of signals, and the original lattice for the alternate vertex cubic construction has twice the final number.

It is not difficult to visualize four dimensional structures. Figure 5 is a two dimensional projection of a hypercube, similar to the familiar projections of the three dimensional cube. The projection contains 8 interlocking three dimensional cube projections. The hypercube projection contains 16 vertices, where 4 lines meet. These 4 lines meeting at each vertex are the 4 orthogonal vectors, ±w, ±x, ±y, and ±z. Each dimension is always represented by the same orientation in the projection, as shown. To go from point (0,0,0,0) to (1,1,1,1), it is necessary to move along a vector from 0 to 1 in each dimension. Each restrictive equation, say w = 0, reduces the dimensionality of the figure by one. The 8 three dimensional cube projections are defined by w = 0,1; x = 0,1; y = 0,1; and z = 0,1. The 8 three dimensional cube projections are joined in pairs. When one three dimensional cube is identified, it can be seen that there is an extra line having the same slope at each vertex. Following all these parallel lines to the next vertex identifies a new three dimensional cube, parallel to the original in four dimensional space. The bounding of the four dimensional cube by three dimensional cubes is similar to the bounding of three dimensional cubes by squares, and the bounding of squares by line segments. If a body centered point is added, it is located in the center
of the projection. The vertices of even parity are circled.

The alternate vertex cubic lattice is the densest lattice in four dimensions, as discussed by Leech (17). There is no possible denser packing of the non-intersecting noise spheres contained in the decision regions. Since it has been shown that circular regions of the densest two dimensional lattice approximate the most efficient two dimensional signal designs, it can be reasonably conjectured that four dimensional hypersphere regions of the densest four dimensional lattice will approximate the best four dimensional designs.

Suppose that it is desired to increase the number of signals while using the same maximum signal energy. Using all the available signal volume gains 0.58 bits per dimension, and using the densest lattice gains 0.25 bits per dimension, so that using both produces a rate increase of 0.83 bit per dimension. Suppose that it is desired to keep the same number of signals and reduce maximum signal energy. Since the number of cubic lattice signals is increased by $\pi^2/2$ by using all the volume, a fixed number of signals can be contained in a volume reduced by $2/\pi^2$. Since the four dimensional volume is $(\pi^2/2)E^2$, E can be reduced by $2^{1/2}/\pi$, or by 3.5 dB. Similarly, since using the densest lattice increases the number of signals by 2, a fixed number of signals can be contained in a volume reduced by 1/2. E can be reduced by $1/2^{1/2}$, or 1.51 dB. Using both all available volume and the densest lattice, E can be reduced by $1/\pi$, or 4.97 dB. These gains are limits that can be approached for high signal to noise, and a large number of signals. For low signal to noise, crowding may cause wasted volume. Although
the signal separation is held constant, increasing the number of neighbors from 
16 to 24 in the densest lattice design probably increases error probability about 
50 percent.

VII. SPECIFIC CONFIGURATIONS OF FOUR DIMENSIONAL SIGNALS

The familiar one and two dimensional signal designs of Table I have rates between 1 and 3 bits per dimension. We will describe the rate per dimension and the ratio of maximum signal energy to signal separation for one dimensional APSK, the extended cubic lattice, and the alternate vertex cubic lattice.

Because of the simplicity of the one dimensional APSK, the required parameters are easy to compute. Suppose that the APSK design has L levels. For L odd, the L different signal levels are (L-1)/2, (L-3)/2, ..., 2, 1, 0, -1, -2, ..., -(L-1)/2. For L=1, (0,0,0,0) is the only signal. For L=3, the value in each dimension can be -1, 0, or 1. The number of signals is $M = L^4$, in four dimensions, and the rate per dimension is $R = (1/4) \log_2 M = \log_2 L$. Since the error behavior depends on the ratio of signal separation to noise variance, signal designs at a given rate can be ranked by the ratio of peak signal amplitude to signal separation.

The signals are contained in a hypercube, having L-1 units on an edge. The hypercube diagonal is twice the length of the maximum energy signal.


    $(4(L-1)^2)^{1/2} = 2E^{1/2}$
    $4(L-1)^2 = 4E$

The distance between signals, d, is 1 unit.

    $E^{1/2}/d = L-1$

For this case, which will be designated case 1, L is odd and (0,0,0,0) is a signal point. For case 2, L is even and the possible signals in each dimension are L-1, L-3, ..., 5, 3, 1, -1, -3, -5, ..., -L+1. For L=2, there are 16 signals, (±1,±1,±1,±1), the hypercube vertices. For case 2, $M = L^4$, as before. The hypercube diagonal is twice the largest signal.

    $(4(2(L-1))^2)^{1/2} = 2E^{1/2}$
    $4(L-1) = 2E^{1/2}$
    $E^{1/2} = 2(L-1)$

The minimum signal distance, d, is 2 units, so that $E^{1/2}/d$ is as before.

    $E^{1/2}/d = L-1$

The transmission rate per dimension, and the ratio of peak signal to signal separation are the same for L odd or even. $E^{1/2}/d$ is plotted versus R in Figure 6, for L = 1, 2, ..., 8.
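
The APSK points of Figure 6 follow directly from the two cases above. The sketch below recomputes them from the signal levels rather than from the closed form, as a consistency check; the function name and the printed layout are illustrative assumptions.

```python
import math


def apsk_parameters(levels):
    """Rate per dimension R and ratio E^(1/2)/d for L-level APSK in four dimensions."""
    if levels < 1:
        raise ValueError("levels must be a positive integer")
    rate = math.log2(levels ** 4) / 4  # R = (1/4) log2 M with M = L^4
    if levels % 2 == 1:
        # Case 1: L odd, unit spacing, levels symmetric about zero.
        values = [k - (levels - 1) / 2 for k in range(levels)]
        separation = 1.0
    else:
        # Case 2: L even, odd-integer levels, spacing of 2 units.
        values = [2 * k - (levels - 1) for k in range(levels)]
        separation = 2.0
    peak = max(abs(v) for v in values)
    energy = 4 * peak ** 2  # the corner signal has the peak value in all four dimensions
    return rate, math.sqrt(energy) / separation


if __name__ == "__main__":
    for L in range(1, 9):
        rate, ratio = apsk_parameters(L)
        print(f"L = {L}: R = {rate:.3f} bits per dimension, E^(1/2)/d = {ratio:.1f}")
```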

The two new classes of signal design introduced above will be described. In the first class, the cubic lattice is extended beyond the APSK hypercube, to include all cubic vertex points not exceeding the maximum allowed signal energy. In the second class of designs, the alternate vertex cubic lattice is used in the same allowed signal region. The permutation codes of Slepian (18), which are described below, are alternate vertex cubic lattice designs in which all signals have the maximum energy. In a small number of dimensions, such equal energy designs have significantly lower rate than designs including points with less than the maximum energy. Slepian did not specifically consider the alternate vertex cubic lattice or dimension of four. Welti and Lee (14), in the consideration of four dimensional designs, include some examples of APSK and of the two new general classes of designs, but do not mention the alternate vertex cubic lattice.

In the extended cubic lattice signal design, all points of the cubic lattice having less than the maximum signal energy are used as signals. For case 1, the cubic lattice includes all points of the form (a,b,c,d), where a, b, c, and d are any integers. The signal points can be enumerated in order of increasing energy. There is 1 point with E=0, (0,0,0,0); there are 8 points with E=1, (±1,0,0,0), (0,±1,0,0), etc.; and there are 24 points with E=2, (±1,±1,0,0), etc. Any lattice point (a,b,c,d) has energy $E = a^2 + b^2 + c^2 + d^2$. If the value a in the first dimension is exchanged with the value b in the second dimension, or if a is replaced by -a, the energy, E, is unchanged. Thus, any lattice point (a,b,c,d) defines a shell of constant energy, containing all the points described by permutations and sign changes of the values a, b, c, and d. These equal energy signal sets were called permutation codes by Slepian (18), who gave the formula for enumeration of the points on a shell.

The enumeration of the signals in an equal energy shell can be briefly explained. Suppose we have a lattice point in D dimensions, $(a_1, a_2, \ldots, a_D)$. We can select one of D values for the first dimension, one of D-1 values for the second dimension, etc., so that there are $D!$ possible orders of the coefficients. If each $a_i$ may be positive or negative, there are $D!\,2^D$ signals. This number must be reduced if some of the $a_i$ have the same magnitude, and are therefore indistinguishable, or if some of the $a_i$ are zero, so that the sign is meaningless. If the coefficients include $m_0$ zeros, the number of points in the shell is $D!\,2^{D-m_0}/m_0!$. If the next largest coefficient is repeated $m_1$ times, the $m_1!$ ways of ordering this coefficient in the $m_1$ dimensions selected for it are indistinguishable. The formula for the number of signals in a shell is

    $N = \dfrac{D!\,2^{D-m_0}}{m_0!\,m_1!\cdots m_k!}$

where k is the number of different nonzero coefficient magnitudes.

In this work, D = 4. The number of points on the shell of (1,0,0,0) is $4!\,2^{4-3}/(3!\,1!) = 8$, corresponding to four dimensional locations for the 1, and positive or negative sign in each location. The number of points on the shell of (1,1,0,0) is $4!\,2^{4-2}/(2!\,2!) = 24$. The shells for E less than or equal to 16 are given in Table II. For several values of E, including 4 and 9, two different sets of integers have the same E. The total number of messages, M, for any peak signal energy, E, is obtained by adding the counts of the included shells.
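
The shell formula and the construction of Table II can be sketched in a few lines. This is an illustrative reimplementation under the definitions above, not the program used in the study; the function names are assumptions.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial, isqrt


def shell_size(point):
    """Number of distinct signals obtained by permuting and sign-changing point."""
    dims = len(point)
    magnitudes = Counter(abs(c) for c in point)
    zeros = magnitudes.get(0, 0)
    count = factorial(dims) * 2 ** (dims - zeros)  # zeros carry no sign
    for multiplicity in magnitudes.values():
        count //= factorial(multiplicity)  # repeated magnitudes are indistinguishable
    return count


def shells(max_energy, dims=4):
    """Map each energy E <= max_energy to its (defining point, shell size) rows."""
    table = {}
    top = isqrt(max_energy)
    for combo in combinations_with_replacement(range(top + 1), dims):
        energy = sum(c * c for c in combo)
        if energy <= max_energy:
            point = tuple(sorted(combo, reverse=True))
            table.setdefault(energy, []).append((point, shell_size(point)))
    return dict(sorted(table.items()))


def total_messages(max_energy, dims=4):
    """M for the extended cubic lattice design with peak energy max_energy."""
    return sum(size for rows in shells(max_energy, dims).values() for _, size in rows)


if __name__ == "__main__":
    for energy, rows in shells(16).items():
        for point, size in rows:
            print(energy, point, size)
```

Calling `total_messages(E)` adds the shell counts up to a chosen peak energy, which is the summation described in the text.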

Since the minimum signal separation is 1, $E^{1/2}/d = E^{1/2}$. Table II was extended to E = 36, and selected designs were plotted in Figure 6. As expected, the extended hypercube lattice designs are superior to APSK, since for fixed signal separation the full lattice provides higher rate at the same maximum signal energy or lower energy at the same rate.

Table II can also be used to enumerate the messages in alternate vertex lattice signal designs. The alternate vertex cubic lattice consists of the cubic lattice points having the sum of their coordinates even or odd. Each permutation shell contains points having the same parity, because permuting the coefficient order does not alter the sum of the coefficients and because changing the sign of $a_i$ changes the coefficient sum by $2a_i$, which does not alter parity. Each permutation shell lies entirely in one of the two alternate vertex lattices, defined by even or odd parity. The shell parity is odd when E is odd, and even when E is even. This is shown as follows. If the sum of the coordinates is even, there is an even number (possibly zero) of odd coordinates. Since the squares of even numbers are even and the squares of odd numbers are odd, E is the sum of even numbers and an even number (or zero) of odd numbers, and E is even. Similarly, if the sum of the coordinates is odd, there is an odd number of odd coordinates, and E is odd. The alternate vertex signal design includes all the shells for E even or odd. The extended (E ≤ 36) Table II was used to compute M. Since the minimum signal distance is $2^{1/2}$, $E^{1/2}/d = (E/2)^{1/2}$. Selected alternate vertex designs are plotted in Figure 6.
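
The parity argument and the resulting message counts can be confirmed by brute force over the integer lattice. The sketch below is an illustration only; it assumes unit cubic lattice coordinates as in Table II, and the function names are chosen here.

```python
from itertools import product
from math import isqrt, log2


def points_within(max_energy):
    """All integer points (a, b, c, d) with a^2 + b^2 + c^2 + d^2 <= max_energy."""
    top = isqrt(max_energy)
    rng = range(-top, top + 1)
    return [p for p in product(rng, repeat=4)
            if sum(c * c for c in p) <= max_energy]


def parity_follows_energy(max_energy):
    """True if every point has coordinate-sum parity equal to the parity of E."""
    return all(sum(p) % 2 == sum(c * c for c in p) % 2
               for p in points_within(max_energy))


def alternate_vertex_design(max_energy, parity=0):
    """Return (M, R, E^(1/2)/d) for the design keeping one parity class.

    max_energy should have the same parity as the class kept, so that it is
    the energy of the outermost shell actually used. The minimum distance in
    these coordinates is 2^(1/2), giving E^(1/2)/d = (E/2)^(1/2).
    """
    m = sum(1 for p in points_within(max_energy) if sum(p) % 2 == parity)
    return m, log2(m) / 4, (max_energy / 2) ** 0.5


if __name__ == "__main__":
    print(parity_follows_energy(36))
    for energy in (2, 4, 8, 16):
        print(energy, alternate_vertex_design(energy))
```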

A table similar to Table II was prepared for case 2, in which only vectors of all odd coordinates are signal points. Some of the points derived using this table for the full cubic lattice design and for the alternate vertex lattice design are also plotted in Figure 6, and these points are specially designated. When R is small, the displacement of the center of the signal set from a vertex point to a cell center generates different combinations of $E^{1/2}/d$ and R, but
the signal designs for R large are similar.

Table II. Permutation shells.

    Energy, E    Defining Point    Number of Messages
        0          (0,0,0,0)              1
        1          (1,0,0,0)              8
        2          (1,1,0,0)             24
        3          (1,1,1,0)             32
        4          (1,1,1,1)             16
                   (2,0,0,0)              8
        5          (2,1,0,0)             48
        6          (2,1,1,0)             96
        7          (2,1,1,1)             64
        8          (2,2,0,0)             24
        9          (2,2,1,0)             96
                   (3,0,0,0)              8
       10          (2,2,1,1)             96
                   (3,1,0,0)             48
       11          (3,1,1,0)             96
       12          (2,2,2,0)             32
                   (3,1,1,1)             64
       13          (2,2,2,1)             64
                   (3,2,0,0)             48
       14          (3,2,1,0)            192
       15          (3,2,1,1)            192
       16          (2,2,2,2)             16
                   (4,0,0,0)              8

Figure 6 shows that, compared to APSK, the two new classes of signal designs for four dimensions can be used to achieve higher rate per dimension for the same maximum signal energy and signal separation. Comparison of APSK to the capacity bound showed that rate gains of up to 1 bit per dimension might be achieved by four dimensional design. Computation showed that extending the cubic lattice throughout the allowed signal region could gain up to 0.58 bits per dimension. From Figure 6, the better designs at higher $E^{1/2}/d$ actually gain 0.35 bits per dimension. Computation also showed that use of the alternate vertex cubic lattice could gain an additional 0.25 bits per dimension. This is achieved at higher $E^{1/2}/d$, and is sometimes slightly exceeded because the alternate vertex designs are derived from cubic designs with higher $E^{1/2}/d$, having less accidentally wasted volume. The largest total gain is about 0.6 bits per dimension, which is a substantial portion of the computed total gain of 0.83 bits per dimension.
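
The computed limits referred to here come from two volume ratios: the hypersphere to its inscribed hypercube, and the factor of 2 in lattice density. The following sketch recomputes them; it assumes, as in the earlier computation, that four dimensional volume scales as the square of the energy E, and the dictionary keys are illustrative names.

```python
import math


def computed_limits():
    """Limiting rate gains (bits per dimension) and energy reductions (dB)."""
    volume_ratio = math.pi ** 2 / 2  # hypersphere volume / inscribed hypercube volume
    density_ratio = 2.0              # alternate vertex lattice vs. cubic lattice

    def rate_gain(ratio):
        # The number of signals grows by `ratio`; the rate is (1/4) log2 M.
        return math.log2(ratio) / 4

    def energy_reduction_db(ratio):
        # Volume is proportional to E^2, so E falls by the square root of `ratio`.
        return 10 * math.log10(math.sqrt(ratio))

    return {
        "rate_volume": rate_gain(volume_ratio),
        "rate_density": rate_gain(density_ratio),
        "rate_total": rate_gain(volume_ratio * density_ratio),
        "db_volume": energy_reduction_db(volume_ratio),
        "db_density": energy_reduction_db(density_ratio),
        "db_total": energy_reduction_db(volume_ratio * density_ratio),
    }


if __name__ == "__main__":
    for name, value in computed_limits().items():
        print(f"{name}: {value:.2f}")
```

Small differences in the last digit between these values and the quoted ones are rounding effects.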

Figure 6 also shows how much the two new classes of signal design allow the energy to be reduced at a given rate and signal separation. Previous computation showed that use of the full cubic lattice would allow an energy reduction of 3.5 dB. As shown by the figure, the full cubic lattice allows the signal energy, E, to be reduced 1.58 dB at 2 bits per dimension, and 2.58 dB at 3 bits per dimension. In the computation of the previous section, the use of the alternate vertex cubic lattice gave a further reduction of 1.51 dB, for a total of 4.97 dB. Actual alternate vertex designs allow the energy to be reduced 2.90 dB at 2 bits per dimension, and 3.95 dB at 3 bits per dimension. While the full computed gain should be achieved at high rate and signal to noise, the actual gains for useful rates are a significant portion of the possible gains.

VIII. FOUR DIMENSIONAL SIGNAL RECEIVERS


For equally probable signals, the receiver should select that member of the transmitted signal set which is closest to the received signal. This can be done, in general, by computing the distances between the received signal and all the members of the signal set, and has been suggested for four dimensional designs by Welti and Lee (14). This complexity is obviously not required for APSK, and for the class of designs using the extended cubic lattice, since each dimension can be detected independently. In the full cubic lattice designs, the results of four one dimensional receiver operations are used to decode the signal message. Distance computations can reduce the error probability, when errors cause the selection of lattice points beyond the allowed signal sphere.

The receiver for the alternate vertex cubic lattice is only slightly more complex than the receiver for APSK or the full cubic lattice. The receiver operation will first be outlined, and then explained in detail. The receiver initially detects the signal in each dimension, as if the signal set were the full cubic lattice with signal separation $2^{1/2}$. The detection process makes a primary decision, defining each orthogonal signal to some multiple of $2^{1/2}$, and also retains 3 bits of secondary information, indicating which of the four orthogonal signals has the largest error, and in what direction. If the primary decision lattice point is verified as a member of the alternate vertex cubic lattice by the parity check, it is accepted as the correct transmitted signal. If the primary decision lattice point is not a member of the alternate vertex lattice, the transmitted signal closest to the received signal is one of the primary decision point's eight neighboring alternate vertex lattice points. These lie at $\pm 2^{1/2}$ along each orthogonal dimension from the primary decision point.

To understand the decision regions and the effect of noise, refer to Figure 7. If the error in each orthogonal signal is less than $2^{1/2}/2$, the transmitted signal is correctly defined by the primary decision. The primary decision region is the decision region of a point in the full original cubic lattice. If one orthogonal signal has an error greater than $2^{1/2}/2$, and the other three signals have error less than $2^{1/2}/2$, the primary decision selects a point which is not a member of the alternate vertex lattice. To attempt the correct decision, the original cubic lattice decision region must be augmented by portions of the original cubic decision regions belonging to deleted neighbor points. The deleted point indicated by the primary decision is equally distant from 8 neighboring alternate vertex points, lying at $\pm 2^{1/2}$ along the four orthogonal axes, and the 8 required alternate vertex decision regions intersect at the primary decision point. The needed decision information is the error vector from the primary decision point to the received signal.

The decision region of a signal includes all volume closer to that signal than any other signal. This decision rule can be implemented by (1) retaining the error vector, (2) adding it to the primary decision point, (3) computing the exact distances to the 8 neighbors of the primary decision point, and (4) selecting the signal with smallest distance to the received signal. An equivalent computation (1) takes the scalar or dot product of the error vector with the 8 displacement vectors between the primary decision point and the neighbors, and (2) selects the signal lying in the direction closest to the error vector, that is, selects the signal with the largest scalar product. Since the 8 neighbor signals lie along the four orthogonal signals defining the four dimensions, this is the equivalent of choosing the signal in the direction having the largest estimated error, according to the secondary information provided by the receiver.
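
The two stage decision described above can be sketched as follows. This is a minimal illustration, not the receiver hardware of the study: it assumes an unbounded lattice (no check against the allowed signal sphere), represents each signal by its integer coefficients in units of $2^{1/2}$, and includes a brute force search only to check the shortcut. All names are hypothetical.

```python
import itertools
import math
import random

SQRT2 = math.sqrt(2.0)


def decode(received, parity=0):
    """Return the integer coefficients of the chosen alternate vertex point."""
    scaled = [r / SQRT2 for r in received]
    primary = [round(s) for s in scaled]  # primary decision, one per dimension
    if sum(primary) % 2 == parity:
        return tuple(primary)             # parity check passed
    # Secondary information: the dimension with the largest error, and its sign.
    errors = [s - p for s, p in zip(scaled, primary)]
    worst = max(range(4), key=lambda i: abs(errors[i]))
    primary[worst] += 1 if errors[worst] >= 0 else -1
    return tuple(primary)


def nearest_by_search(received, parity=0, radius=4):
    """Reference decoder: exhaustive distance comparison over a finite box."""
    candidates = (a for a in itertools.product(range(-radius, radius + 1), repeat=4)
                  if sum(a) % 2 == parity)
    return min(candidates,
               key=lambda a: sum((r - SQRT2 * c) ** 2 for r, c in zip(received, a)))


def self_check(trials=50, seed=0):
    """Compare the two decoders on random received vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        received = [rng.uniform(-3.0, 3.0) for _ in range(4)]
        if decode(received) != nearest_by_search(received):
            return False
    return True


if __name__ == "__main__":
    print(decode((1.3, 0.6, 0.0, 0.0)))  # primary (1,0,0,0) fails parity; corrected
    print(self_check())
```

Only the index and sign of the largest per-dimension error are used in the correction step, which corresponds to the 3 bits of secondary information.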

We further discuss the decision regions, and the causes of a receiver error. Since the minimum signal distance is 2, the receiver always selects the correct signal when the error is less than 1, the radius of the hypersphere contained in the decision region. The primary decision is correct when the error magnitude is less than $2^{1/2}/2$ in each dimension, since this defines the full cubic lattice decision region. As can be seen from Figure 7, if the error in any dimension is greater than $2^{1/2}$, or if the error in each of two dimensions is greater than $2^{1/2}/2$, the received signal is closer to an incorrect alternate vertex point than to the correct one, and the correct decision is impossible. The receiver sometimes selects the correct signal for error magnitudes between 1 and $2^{1/2}$,

since the decision regions extend $\pm 2^{1/2}$ along the orthogonal signal directions.

Figure 7. The cubic lattice in two dimensions, with the alternate vertices indicated by crosses, and with the decision regions indicated.

Primary decision errors are corrected by the secondary decision when the magnitude of the apparent error (from the incorrect primary decision point to the received signal) is largest in the orthogonal direction of the correct signal. Suppose that the actual error vector, between the transmitted signal and the received signal, is $(e_1, e_2, e_3, e_4)$. When the primary decision is correct, the receiver secondary information designates the largest actual error. When the primary decision is in error, the secondary information is based on the vector between the incorrect primary decision point and the received signal. If the largest error is greater than $2^{1/2}/2$ and less than $2^{1/2}$, and all the other errors are less than $2^{1/2}/2$, the primary decision is in error, but there is a possibility that the secondary decision is correct. The receiver will estimate the error magnitude in the direction of the actual largest error as $|\hat{e}_1| = 2^{1/2} - |e_1|$, rather than $|e_1|$, but the other three error estimates are correct. If the largest error is in the first dimension, the estimated error is $(\hat{e}_1, e_2, e_3, e_4)$, where $|\hat{e}_1| = 2^{1/2} - |e_1|$ is incorrectly estimated. The secondary decision produces the correct signal when the estimated error is largest in the direction of the correct signal.

    $|\hat{e}_1| > |e_2|$
    $|\hat{e}_1| > |e_3|$
    $|\hat{e}_1| > |e_4|$

Using $|\hat{e}_1| = 2^{1/2} - |e_1|$, the requirements for a correct decision are

    $|e_1| + |e_2| < 2^{1/2}$
    $|e_1| + |e_3| < 2^{1/2}$
    $|e_1| + |e_4| < 2^{1/2}$

Allowing the largest error to be in the second, third, or fourth dimensions gives three additional equations.

    $|e_2| + |e_3| < 2^{1/2}$
    $|e_2| + |e_4| < 2^{1/2}$
    $|e_3| + |e_4| < 2^{1/2}$

In agreement with the verbal discussion, there is an error if any $|e_i| > 2^{1/2}$, or if any two $|e_i| > 2^{1/2}/2$, and there is no error if all $|e_i| < 2^{1/2}/2$. If the inequality is replaced by an equality, each equation defines four decision boundaries in two dimensions, one boundary corresponding to each possible selection of signs instead of absolute values. The decision boundaries discriminate between a signal and its four neighbors in some plane, as shown in Figure 7. The six equations correspond to the six different sets of two dimensional planes in four dimensions. Each boundary of the form $e_i + e_j = 2^{1/2}$, for signed $e_i$ and $e_j$, places one constraint on four dimensional space, and so defines a three dimensional decision region boundary. Taken together, the six equations each with four sets of signs define the 24 three dimensional boundaries of the four dimensional decision region, so that there is one boundary for each neighbor.
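
The statement that the six pairwise conditions describe the whole decision region can be spot checked numerically. The sketch below compares the conditions with a direct distance test against the 24 neighbors of a signal placed at the origin; the sampling range, trial count, and function names are assumptions made for illustration.

```python
import itertools
import math
import random

SQRT2 = math.sqrt(2.0)
PAIRS = list(itertools.combinations(range(4), 2))  # the six coordinate planes


def neighbors():
    """The 24 neighbors of the origin: +/- 2^(1/2) in two of the four dimensions."""
    points = []
    for i, j in PAIRS:
        for sign_i, sign_j in itertools.product((1, -1), repeat=2):
            v = [0.0] * 4
            v[i] = sign_i * SQRT2
            v[j] = sign_j * SQRT2
            points.append(tuple(v))
    return points


def pair_conditions_hold(e):
    """The six requirements |e_i| + |e_j| < 2^(1/2)."""
    return all(abs(e[i]) + abs(e[j]) < SQRT2 for i, j in PAIRS)


def origin_is_closest(e):
    """True if the received point e is closer to the origin than to any neighbor."""
    d0 = sum(x * x for x in e)
    return all(d0 < sum((x - n) ** 2 for x, n in zip(e, nb)) for nb in neighbors())


def agreement(trials=2000, seed=1):
    """Check that both tests give the same answer on random error vectors."""
    rng = random.Random(seed)
    for _ in range(trials):
        e = [rng.uniform(-1.5, 1.5) for _ in range(4)]
        if pair_conditions_hold(e) != origin_is_closest(e):
            return False
    return True


if __name__ == "__main__":
    print(len(neighbors()), agreement())
```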

The probability of a correct decision can be computed by integration of the four dimensional probability density over the four dimensional decision region. It is assumed that the probability density is independent gaussian with equal variance in each dimension. These computations have not been performed, but related results appear in the literature (14, 15). The correct decision probability can be underbounded by integrating the probability density over the spherical region of radius 1, the underbound of the decision region. This spherical region is defined by $X^2 = e_1^2 + e_2^2 + e_3^2 + e_4^2$, so that the variable $X^2$ has the

chi-square distribution and the integral is tabulated.

IX. CONCLUSION

In bandwidth compressive modulation, increased signal energy is used to achieve higher transmission rate. Familiar multilevel amplitude and phase shift keying (APSK) was shown to be an inefficient signal design in higher dimension, compared to the capacity bound. Four dimensional designs were shown to offer some potential improvement, and two new classes of four dimensional designs were introduced. Each class was devised to correct one of the faults of the APSK design. The first class of designs extends the four dimensional cubic lattice throughout the hypersphere bounded by maximum signal energy. The second uses the densest four dimensional lattice, the alternate vertex cubic lattice, to define the signal points. This lattice provides the smallest decision region volume having a fixed minimum signal separation. By analogy with two dimensional results, it is conjectured that alternate vertex lattice signal designs are similar to the optimum four dimensional designs in geometry and performance. At 2 to 3 bits per dimension, the densest lattice designs increase the rate by 0.6 bits per dimension over APSK, with fixed maximum signal energy and signal separation. Alternately, they allow maximum signal energy to be decreased by 3 or 4 dB, with fixed rate and signal separation.

The receivers for the two new classes of four dimensional signal designs are very similar to APSK receivers. The receiver for alternate vertex lattice designs requires a small amount of additional information, the identification of the signal dimension and signal direction having the largest estimated error. The new four dimensional signal designs are suitable for the implementation of bandwidth compressive modulation.





REFERENCES

(1) J.J. Spilker, Digital Communications by Satellite, Englewood Cliffs, N.J., Prentice-Hall, 1977, p. 324.

(2) W.J. Weber, III, P.H. Stanton, and J.T. Sumida, "A Bandwidth Compressive Modulation System Using Multi-Amplitude Minimum Shift Keying (MAMSK)," IEEE Trans. Communications, COM-26, pp. 543-551.

(3) G.R. Welti, and R.K. Kwan, "Comparison of Signal Processing Techniques for Satellite Telephony," NTC '77 Conference Record, IEEE 77CH1292-2 CSCB, 1977, pp. 05:1-1 to 05:1-6.

(4) G.R. Welti, "PCM/FDMA Satellite Telephony with 4-Dimensionally-Coded Quadrature Amplitude Modulation," COMSAT Technical Review, vol. 6, No. 2, 1976, pp. 323-338.

(5) "Guide to Modems," Computer Decisions, 1973, pp. 36-40.

(6) C.E. Shannon, "Communication in the Presence of Noise," Proc. IRE, vol. 37, 1949, pp. 10-21.

(7) C.E. Shannon, "Probability of Error for Optimal Codes in a Gaussian Channel," Bell Syst. Tech. J., vol. 38, 1959, pp. 611-656.

(8) J.M. Wozencraft, and I.M. Jacobs, Principles of Communication Engineering, New York, Wiley, 1965, p. 2g4.

(9) J.M. Wozencraft, and I.M. Jacobs, op. cit., p. 321.

(10) J.R. Davey, "Modems," Proc. IEEE, vol. 60, 1972, pp. 1,284-1,292.

(11) J.M. Wozencraft, and I.M. Jacobs, op. cit., pp. 323-332.

(12) J.M. Wozencraft, and I.M. Jacobs, op. cit., pp. 317-320.

(13) G.J. Foschini, R.D. Gitlin, and S.B. Weinstein, "Optimization of Two-Dimensional Signal Constellations in the Presence of Gaussian Noise," IEEE Trans. Communications, COM-22, 1974, pp. 28-37.

(14) G.R. Welti, and J.S. Lee, "Digital Transmission with Coherent Four-Dimensional Modulation," IEEE Trans. Information Theory, vol. IT-20, 1974, pp. 497-502.

(15) G.R. Welti, op. cit.

(16) D.M.Y. Sommerville, An Introduction to the Geometry of N Dimensions, New York, Dover, 1958. (Republication of the edition of 1929.) p. 137.

(17) J. Leech, and N.J.A. Sloane, "Sphere Packings and Error Correcting Codes," Canadian J. Math., vol. 4, 1971, pp. 718-745.

(18) D. Slepian, "Permutation Modulation," Proc. IEEE, vol. 53, 1965, pp. 228-236.

(19) A. Papoulis, Probability, Random Variables, and Stochastic Processes, New York, McGraw-Hill, 1965, p. 250.

(20) M. Abramowitz, and I.A. Stegun, Handbook of Mathematical Functions, Washington, National Bureau of Standards, 1970, p. 940.





APPENDIX A 
REVIEW OF THE TRW
MULTIPLE ACCESS STUDY



This report discusses the "Mobile Multiple Access Study" prepared for NASA-Goddard by TRW, and considers how this effort affects the work planned by NASA-Ames. The TRW study was performed with limited objectives, budget, and schedule, and is generally responsive to Goddard's RFP.

The study objectives were: 

1) to consider FDMA, TDMA, and CDMA multiple access techniques, and to select the best based on terminal cost, operating procedures, system capacity, and satellite complexity and cost,

2) to describe the system using the selected multiple access technique, including hardware parameters and operation procedures, and

3) to design and estimate the cost of the mobile terminals.

This study is part of the Public Service Communications Satellite development, and several important system constraints are given in the RFP. There is a single channel per terminal, implying single channel per carrier, and continental US coverage is required. The transmitted signals are as follows:

1. Voice: Narrowband FM, BW = 10 kHz, C/M = UVdB-Hz

2. Data: PCM-PSK, 75 and 300 bps, E/N0 = 10.3 dB

3. Fax: 1200 or 2400 bps, E/N0 = 8 dB.

Possible system configurations include; 

1. several or multiple fixed beams 

2. UHF or L band 

3. user to user/ user to central station with connection to an 
existing network. 

The selected multiple access technique is FDMA, because of its simplicity and greater voice channel capacity. FDMA's simplicity is reflected in lower terminal cost and less complex operation. Although the comparison of multiple access techniques is not given in the required parametric form, TRW's reviews of the techniques are of interest. For FDMA, a random access order wire and central station frequency assignments are recommended. As in SPADE, the carriers are voice activated to save transponder power. It seems generally accepted that FDMA is the simplest and most efficient multiple access technique.

TDMA and CDMA require a digital voice modulation, rather than the specified FM. FDMA is optimum for digital modulation as well as FM, and FDMA/PSK is used in INTELSAT's well known SPADE system. Although SPADE used 7 bit PCM and QPSK, continuously variable slope delta modulation is more nearly competitive with FM, which is considered best. Delta modulation is less sensitive to errors than PCM, and there has been LSI development for both techniques.

The TRW discussion of narrowband FM is somewhat puzzling. TRW uses Carson's rule for the bandwidth of wideband FM, and the signal-to-noise improvement formula for wideband FM, in the discussion of narrowband FM. It is well known that the bandwidth of a narrowband FM signal extends from the carrier to plus-and-minus the highest baseband frequency, exactly as in ordinary AM. Narrowband FM has essentially the same signal-to-noise performance, at high signal-to-noise ratios, as ordinary AM. This is why wideband FM is most frequently used. The apparent reason for using narrowband FM is to reduce voice channel bandwidth in the satellite transponder. The given bandwidth of 10 kHz can be interpreted as twice the highest voice frequency (4 kHz), plus a guardband (2 kHz). TRW allows 25 kHz per voice channel, apparently 10 kHz for the modulated voice and a 15 kHz guardband. It is possible to reduce the transponder guardbands and bandwidth, or to use wideband FM ( = 4, BW = 20 kHz), or to use digital modulation.
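
The bandwidth bookkeeping in this paragraph can be written out explicitly. The sketch below uses the 4 kHz voice frequency, 2 kHz guardband, and 25 kHz channel allocation quoted above; Carson's rule is given in its usual form, twice the sum of peak deviation and highest baseband frequency, and the 12 kHz deviation in the example is a hypothetical value, not a figure from the TRW study.

```python
def narrowband_fm_bandwidth_khz(f_max_khz):
    """Narrowband FM occupies the carrier plus and minus the highest baseband frequency."""
    return 2.0 * f_max_khz


def carson_bandwidth_khz(peak_deviation_khz, f_max_khz):
    """Carson's rule for wideband FM: B = 2 (peak deviation + highest baseband frequency)."""
    return 2.0 * (peak_deviation_khz + f_max_khz)


if __name__ == "__main__":
    voice_khz = 4.0       # highest voice frequency, from the text
    guardband_khz = 2.0   # guardband inside the 10 kHz figure, from the text
    channel_khz = 25.0    # TRW allocation per voice channel, from the text

    signal_khz = narrowband_fm_bandwidth_khz(voice_khz) + guardband_khz
    print("modulated voice with guardband:", signal_khz, "kHz")
    print("remaining transponder guardband:", channel_khz - signal_khz, "kHz")

    # Hypothetical wideband example, for comparison only.
    print("Carson bandwidth at 12 kHz deviation:",
          carson_bandwidth_khz(12.0, voice_khz), "kHz")
```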

The study presents a user traffic model, and concludes that 72 multiple access channels can accommodate 1,200 heavy users or 10,000 light users with one-tenth of the attempted calls receiving a busy signal. 28 channels are permanently assigned to very heavy users. The user model defines a heavy user as one making five calls, of ten minutes duration, in an eight hour day. A light user makes two three minute calls. No actual data are given, and this loading seems optimistic. The call attempt rate parameter is incorrect on page 3-2, and correct on page 3-5.
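
The loading implied by the user model can be estimated with a short calculation. The sketch below assumes calls are spread uniformly over the eight hour day, and uses the standard Erlang B recursion as a stand-in for the blocking calculation; the TRW traffic model itself is not reproduced in this review, so the blocking value from this sketch need not match the study's one-tenth figure. Function names are illustrative.

```python
def offered_load_erlangs(users, calls_per_day, minutes_per_call, day_minutes=480.0):
    """Average number of simultaneous calls, assuming uniform use over the day."""
    return users * calls_per_day * minutes_per_call / day_minutes


def erlang_b(load, channels):
    """Blocking probability for `load` erlangs offered to `channels` channels.

    Uses the recursion B(0) = 1, B(n) = load * B(n-1) / (n + load * B(n-1)).
    """
    blocking = 1.0
    for n in range(1, channels + 1):
        blocking = load * blocking / (n + load * blocking)
    return blocking


if __name__ == "__main__":
    heavy = offered_load_erlangs(1200, calls_per_day=5, minutes_per_call=10)
    light = offered_load_erlangs(10000, calls_per_day=2, minutes_per_call=3)
    print("heavy user load:", heavy, "erlangs")
    print("light user load:", light, "erlangs")
    print("Erlang B blocking on 72 channels:", round(erlang_b(heavy, 72), 3))
```

Under the uniform-use assumption the two user populations offer the same total load, which is consistent with the study treating them as alternatives.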

The FDMA system design has several interesting results. Systems are described using a single beam, four time zone beams, and eight north-south time zone beams. Although the eight beam system accommodates 400 rather than 100 voice channels, the power and weight requirements favor the four beam system. UHF rather than L band is chosen simply for a 6 dB reduction in path loss. There are no linear, space qualified UHF amplifiers available, and TRW has submitted a proposal to develop one, which would use twenty-four transistors in parallel. Using a Ku band downlink to a central station would save UHF bandwidth.

Block diagrams and cost estimates are developed for FDMA, TDMA, and CDMA mobile
ground terminals. The costs for full capability (transmit, receive, multiple access)
FDMA, TDMA, and CDMA terminals are $1,900, $2,500, and $2,500 per unit for 10,000
units. About two-thirds of these costs are for component parts. The quoted costs
exclude the R&D and production design cost, which is estimated at 2 to b million
dollars, or an additional $200 to $b00 per unit for 10,000 units.

TRW considers the following topics worthy of further study:

1. steerable mobile antenna
2. spacecraft UHF multibeam antenna
3. 25 kHz transponder filters, surface wave acoustic
4. UHF linear transponder (proposal submitted)
5. automated network control
6. control system design, queuing, priorities
7. multipath and RFI
8. passive IM
9. transponder IM
10. multipaction (low pressure electron resonance)
11. Ku, Ka band mobile equipment
12. user traffic model
13. tolerable delay and busy
14. A/O requirement
15. privacy requirement
16. voice quality requirement

In view of the major emphasis on system configuration and terminal hardware in 
this study, and the fact that FDMA/FM is generally accepted to be superior for large 
numbers of users, it was useless to wait for these results before issuing the Ames 
RFP. 

The interesting question seems to be, "How close can digital voice approach FM?"
While FM is superior in both EIRP and bandwidth requirements, digital voice has useful
features that might counterbalance its disadvantages. Assuming that the Public
Service Communications Satellite will have a mobile service transponder designed
for FDMA/FM, Ames should investigate the possibility of a compatible digital voice
experiment, perhaps using laboratory rather than fully packaged equipment.

There are several reasons that digital voice should be considered:

1. gradual conversion of land line links to digital
2. cost reduction for large quantities by LSI development
3. digital voice compatible with data
4. digital voice easily scrambled for privacy
5. delta modulation more tolerant of marginal signal-to-noise
6. digital can use regenerative satellite repeaters, in later applications
7. digital modulation can be made narrow band by bandwidth compressive modulation

FM is a wideband technique, suitable for the older generation of power limited
satellites. Current satellites are bandwidth limited, as indicated by the choice of
narrowband FM. If high power is available, bandwidth compressive modulation (eight
phase, etc.) can reduce digital bandwidth below that required for narrowband FM.

The first step in planning a compatible digital experiment would be to compare
FM, PCM, and adaptive delta modulation parametrically, varying signal-to-noise ratio
and bandwidth. FM would be considered at various quality (signal-to-noise) levels
and modulation indices, including narrowband. PCM and delta modulation would be
considered at various quality levels (bits per sample, error rate) and with different
RF modulation methods (QPSK, JEK, eight-phase). The TRW study assumed narrowband FM,
and selected FDMA; this study would assume FDMA, and compare voice and RF modulation
techniques, for equal subjective voice quality.

The second step would consider the impact of the different modulation methods
on system design. Given some fixed path loss and mobile and satellite G/T, link
calculations would be performed to determine mobile and satellite power. The effect
of transponder intermodulation must be considered for the different modulation
techniques. Although the total transponder power and bandwidth would be determined by
the FDMA/FM design, it is possible to have a digital experiment at different voice
channel power and bandwidth than that of the FDMA/FM design. Several FM channels
could be occupied by one digital channel, and digital carrier power could be reduced
by tone loading the transponder. Narrow band, high power digital channels could
be used by leaving some of the transponder bandwidth empty.

The last step in designing a digital experiment depends on the final design of
the Public Service Satellite transponder. During the earlier steps, consideration
should be given to the possibility of designing a competitive digital system. It
may be that considerations of privacy, transponder intermodulation, or UHF amplifier
design will indicate the need for an approach other than the familiar FDMA/FM.

This report will review the power budgets suggested for land mobile communications
by several sources, and will briefly describe three recent papers describing voice
modulation techniques.

The system configuration of a satellite mobile communication system provides 
necessary background for the study of modulation and multiple access methods. We 
should consider if such a system is feasible, if it is EIRP or bandwidth limited, and 
if modulation and access methods affect feasibility. We will examine the parameters 
suggested by several sources, including Sam Fordyce of NASA headquarters, the TRW 
study done for NASA Goddard (and previously reviewed), the STI proposal, and the PSCS
brochure published by NASA. The parameters used in a recent demonstration by Dr. James 
Brown of NASA-Goddard are given for comparison. 

Both the up-link and down-link will be in the 800-947 MHz UHF band, and the same
antennas will be used for both. Because of the limitation on spacecraft power, the
down-link is more critical. Table I gives the down-link budgets from the different
sources. Calculated parameters, indicated by parentheses, were found using formulas
in the ITT Reference Data for Radio Engineers. The power budgets are similar, due to
common assumptions.

The satellite antenna could be 15 ft, providing 30 dB gain and continental US
coverage, if 20-30 dBW transponder power is available. The TRW budget shown has an
antenna for each time zone, and 2.6 dB more gain. The mobile antenna gain of 3 dB has
been a ground rule of these studies, but has been questioned by TRW and STI. High
gain mobile antennas would ease down-link power budget problems, but would increase
mobile receiver cost. Servo controlled tracking dishes (1.5 ft - 10 dB) or Yagi arrays
(0.5 ft - 7 dB) might be workable. A better system might be several directional
antennas, with the strongest signal selected electronically, or even an electronically
phased array. The antenna cost should be less than a few hundred dollars.
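
The dish gains quoted above are consistent with the usual aperture gain formula. The sketch below assumes a frequency of 850 MHz and an aperture efficiency of 0.55; both are illustrative assumptions, not values from the budgets reviewed here.

```python
import math

SPEED_OF_LIGHT_M_S = 299_792_458.0
FEET_TO_METERS = 0.3048


def dish_gain_db(diameter_ft, frequency_hz=850e6, efficiency=0.55):
    """Gain of a circular aperture: efficiency * (pi * D / wavelength)**2, in dB."""
    wavelength_m = SPEED_OF_LIGHT_M_S / frequency_hz
    diameter_m = diameter_ft * FEET_TO_METERS
    return 10 * math.log10(efficiency * (math.pi * diameter_m / wavelength_m) ** 2)


for diameter in (1.5, 15.0, 30.0):
    print(f"{diameter:5.1f} ft dish: {dish_gain_db(diameter):5.1f} dB")
```

Doubling the diameter adds about 6 dB, whatever efficiency is assumed.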

For the modulation used, the bandwidth varies from 10 to 25 kHz. For a 25 kHz
allocation, including guardband, 4 MHz provides 160 voice channels. The power required
for each active voice channel varies from 3.2 to 12 Watts. By removing power from
inactive voice channels, as in SPADE, the average power can be reduced 60% or 4 dB,
and noise is also reduced. Since 40 to 200 voice channels are required, total
transponder power is very large.
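
The channel count and the voice activation gain can be verified with simple arithmetic. The sketch below uses only the figures quoted in the paragraph above; the function names are illustrative.

```python
import math


def channels_in_band(band_hz, spacing_hz):
    """Number of whole voice channels that fit in the allocated band."""
    return band_hz // spacing_hz


def power_reduction_db(fraction_removed):
    """dB reduction in average power when a fraction of the power is removed."""
    return -10 * math.log10(1.0 - fraction_removed)


print("channels in 4 MHz at 25 kHz spacing:", channels_in_band(4_000_000, 25_000))
print("60% power reduction in dB:", power_reduction_db(0.60))
```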
Table I: Down-link power budgets from several sources

The power budgets, and experiments, show that some mobile service can be provided.
The question of feasibility reduces to the cost per voice channel. Sam Fordyce
estimates the cost of satellite and launch to be $20 million, and operating costs
for ten years life at $16 million. The TRW study found that 100 voice channels can
accommodate 28 preassigned users with 1,000 heavy demand assigned users or 10,000
light users. Emergency only access or very expensive access would increase the
total number of users greatly. There are about 300,000 mobile telephone users and
10 million CB users currently. Commercially available land mobile radio now costs
$1,000 per unit, and TRW estimated that satellite mobile units would cost $2,500 if
10,000 units were produced. The cost of the mobile units, for 10,000 total, exceeds
the satellite cost. Costs of constructing and operating a central control station
have not been estimated.

The large total cost of land mobile stations has apparently dictated the use of 
a simple land mobile antenna and of the UHF band. It has been noted that a high gain 
mobile antenna would allow a smaller spacecraft antenna and less transponder power. 

With two high gain antennas of constant aperture, the net free space loss minus the
antenna gains decreases 6 dB as the frequency is doubled. This further eases the
spacecraft design, at the cost of higher antenna surface tolerance and more expensive
RF equipment. These options appear closed. 

Under these constraints, the spacecraft design problem reduces to selecting the 
proper mix of antenna gain and transponder power to achieve the required EIRP per 
channel and number of channels. Linear UHF transponders for spacecraft are not
currently available, but should be at least as efficient as those in higher bands.

A reasonable system should probably use antennas as large as have been used, 30 ft,
which would gain 6 dB over the system in the first column of Table I. Voice activation
gains another 4 dB, and the power required for 100 channels is 10^ W. This is similar
to the values given in the second column, by TRW. An adequate system is feasible.

The limited availability of UHF bandwidth and the engineering limits on EIRP 
provide two limits on the system. The power budgets indicate that both FM and
digital techniques provide the required voice channel power and bandwidth.

We next consider three papers that consider modulation techniques for land mobile
voice communication.

Bruce Lusignan of Stanford has just completed a study of modulation and voice
processing for land mobile radio, funded by an FCC UHF task force. FM and SSB are
compared. The commercial land mobile system is FM, with d = 1.7. The voice bandwidth
is 16 kHz, and channel spacing is 20 to 25 kHz. Acceptable voice quality is
achieved at 15 to 30 dB output SNR, with 30 dB the target. Using the current system
as a starting point, it was found that syllabic amplitude companded and frequency
compressed SSB can reduce voice bandwidth to 1.7 kHz, and channel separation to 2 - 2.5 kHz,
a 10:1 reduction. In the 15-30 dB SNR range, the peak power of SSB is about equal to
FM, and the average power is about 6 dB less.

Syllabic companding uses a nonlinear amplitude transfer function to reduce the
natural amplitude variations of speech by 2 to 1 on a log scale. Companding reduces
the variability of the speech SNR and gives an apparent 15 dB improvement in the
system tested.
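
The 2 to 1 compression on a log scale can be illustrated by its effect on signal levels in dB. The sketch below shows only the static level mapping; it assumes a 0 dB reference level and does not model the syllabic (slow) envelope detector of an actual compandor. The function names are illustrative.

```python
def compress_level_db(level_db, ratio=2.0, reference_db=0.0):
    """Compressor: level differences from the reference are divided by the ratio."""
    return reference_db + (level_db - reference_db) / ratio


def expand_level_db(level_db, ratio=2.0, reference_db=0.0):
    """Expander: the inverse mapping applied at the receiver."""
    return reference_db + (level_db - reference_db) * ratio


# A talker 40 dB below the reference is transmitted only 20 dB below it.
quiet_talker_db = -40.0
transmitted_db = compress_level_db(quiet_talker_db)
restored_db = expand_level_db(transmitted_db)
print(quiet_talker_db, "->", transmitted_db, "->", restored_db)
```

A 40 dB range of speech levels is therefore sent as a 20 dB range, so weak talkers sit further above the channel noise.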

The speech compression system folds the higher frequencies over the lower
frequencies, reducing power and bandwidth to 60% of that normally used. Because the human
voice usually produces either low frequency vowels or high frequency consonants, high
quality voice results. This is a new invention of Harris, Cleveland, and Lott, and
uses straightforward analog circuits. There is no description of this system, and
the method of distinguishing the higher and lower frequencies is not obvious. Possibly
two quadrature modulated SSB signals are used.

Both amplitude companding and frequency compression are done at baseband, and
can be used with any modulation. The gains of amplitude companding are larger for
SSB than FM at the low SNR's used in land mobile radio. The study concludes that
SSB should be used because of its lower bandwidth requirement. The special SSB
equipment would add $100 to $300 to the current installed land mobile unit cost of
$1,000. The FCC introduction to this study mentions that there are requests for
four or five times the available spectrum in the 850 to 947 MHz band.

At NTC 77, Welti and Kwan of COMSAT compared voice signal processing, modulation,
and multiplexing methods for satellite telephony. Although SCPC was excluded, the
study is useful. This paper includes the familiar FM and digital methods - FM,
companded FM (CFM), 4 and 8 phase PCM, adaptive PCM (ADPCM), delta modulation (DM) -
along with some less familiar methods. Nearly-instantaneous-companding (NIC) is a
method of adaptively scaling PCM samples. Variable slope delta modulation
(VSDM) is reported to give good speech quality at 24 to 3^ kbps and 10 error
probability. In two-pulse amplitude and phase modulation (2P-APM), two separate
pulses are transmitted for each sample. Four dimensional quadrature amplitude
modulation (4D-QAM) also uses two pulses, but has discrete amplitudes and phases
and is designed in four dimensions.

FDM and TDM multiplexing methods were considered, both with and without
time assigned speech interpolation (TASI) for FDM and digital speech interpolation
(DSI) for TDM. For FDM, the required C/N0 for CFM, 2P-APM, PCM/PSK, or PCM/4D-QAM
is about 60 dB-Hz. 2P-APM has the narrowest bandwidth, with CFM and PCM/4D-QAM
requiring more bandwidth and FM and PCM/PSK requiring up to 100% more. Some
interesting techniques were considered only for TDM, and not FDM. PCM/NIC, DM,
and ADPCM require 4 dB less than CFM or PCM. DM and ADPCM could operate
satisfactorily with a further 2 dB power reduction. For these types of modulation
with PSK, the bandwidth is about the same as 2P-APM, the narrowest technique studied
for FDM. 4D-QAM reduces the bandwidth another

Although the results of Welti and Kwan are not for SCPC, they seem to indicate
that digital methods are quite competitive. The use of four dimensional, two pulse,
signal design for bandwidth compression is interesting, and is similar to a previous
suggestion for research at NASA-ARC.

Campanella, Suyderhoud, and Wachs compared FM, CFM, and DM for SCPC in the
March 1977 special issue of the IEEE Proceedings on satellite communications. This
article is important to the current study to be awarded by NASA-ARC, and GE and Ford
reference it in their proposals.

The paper discusses SCPC/FM performance in detail. A PLL receiver extends
the FM threshold about 3.^' dB at ^ 3. Pre-emphasis/de-emphasis gives a net
improvement of about 5 dB. Syllabic companding provides gains from a few dB for
wideband FM ( ="^.3) up to 20 dB for narrowband FM ( =2.7). Because the compandor
is not instantaneous, there is a noise burst after each speech burst. This "hush-hush"
noise is estimated to cause a c dB degradation. These values were used to compute
the theoretical FM performance.

SCPC/DM performance was also calculated, with BPSK and QPSK modulation. Digitally
controlled-slope delta modulation (DCDM) is used, rather than continuously variable
slope delta modulation (VSDM). Both methods adaptively control the delta
modulation step size.
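
The idea of adaptive step size control can be shown with a generic adaptive delta modulator. The sketch below is neither the DCDM nor the VSDM algorithm of the paper: it assumes a simple rule in which the step grows when successive bits agree (slope overload) and shrinks when they alternate (granular noise). All parameter values are illustrative.

```python
import math


def adapt_step(step, bit, prev_bit, grow=1.5, min_step=0.01, max_step=1.0):
    """Assumed adaptation rule: grow on repeated bits, shrink on alternation."""
    step = step * grow if bit == prev_bit else step / grow
    return min(max(step, min_step), max_step)


def adm_encode(samples):
    """One bit per sample: 1 if the input is at or above the running estimate."""
    estimate, step, prev_bit = 0.0, 0.01, 0
    bits = []
    for x in samples:
        bit = 1 if x >= estimate else 0
        step = adapt_step(step, bit, prev_bit)
        estimate += step if bit else -step
        bits.append(bit)
        prev_bit = bit
    return bits


def adm_decode(bits):
    """The decoder repeats the encoder's step adaptation from the bits alone."""
    estimate, step, prev_bit = 0.0, 0.01, 0
    output = []
    for bit in bits:
        step = adapt_step(step, bit, prev_bit)
        estimate += step if bit else -step
        output.append(estimate)
        prev_bit = bit
    return output


signal = [0.5 * math.sin(2 * math.pi * k / 64) for k in range(256)]
decoded = adm_decode(adm_encode(signal))
mean_error = sum(abs(a - b) for a, b in zip(signal, decoded)) / len(signal)
print("mean absolute tracking error:", mean_error)
```

Because the step size is derived from the transmitted bits, no side information is needed; the decoder tracks the encoder exactly in the absence of channel errors.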

Subjective tests were made by comparing degradations to those produced by
Gaussian noise. The experimental results had reasonable agreement with theoretical
results. The experimental curves of subjective noise vs C/N0 for CFM and DCDM
are very close. The theoretical performance of CFM and DCDM is also quite close, as
shown by a table of the theoretical C/N0 at nearly equal bandwidths and a fixed
noise level.

Campanella et al. conclude that, because performance is similar, the choice
between CFM and DCDM can be made based on other factors. DCDM is somewhat better
than CFM in intermodulation susceptibility. The DCDM voice digitizer is not suitable
for data transmission, but the BPSK or QPSK modulator can be used directly. DCDM is
better than CFM in speech burst detection for speech interpolation, and has greater
immunity to carrier frequency jitter.

These results confirm the conclusion of the review of power budgets, that 
FM and digital modulation could both provide the required power and bandwidth 
for a satellite land mobile system. However, both power and bandwidth constraints 
are tight enough to make the system feasibility marginal. The Lusignan study 
raises the question of SSB modulation, which has both reduced average power and
the minimum possible bandwidth. As SCPC requires a linear transponder, SSB
might be usable.

This section, A, briefly describes the contents of this final report for
contract NAS2-9703. A previous final report, now designated the interim final
report, was delivered in July, 19yb. This earlier report described research
in transform conditional replenishment, Landsat image compression, and satellite
communications. The contract was first modified to include additional research
in these three areas and in the new area of simple systems. The contract was
later modified again to delete the additional research in Landsat image compression
and in satellite communications, and to have reduced additional effort in
conditional replenishment.

This report contains four further sections. Section B contains the study
plan, systems survey, and literature search results for the simple systems study.
Section C describes an investigation into the performance gain for nonstationary
image data. The results bound the performance of simple systems. Contrary to
the usual assumption, the gain due to nonstationarity is small. Most of the
performance gains of adaptive, variable rate compression systems can also be
obtained for stationary data. Section D describes the computer programs and
the compressed video data available on the SEL 32. The conditional replenishment
system that was simulated under this contract was described in the interim final
report, and the simulation program used is a SEL 32 version of the previous
program. Section E consists of the published versions of two papers that were
included as typed versions in the interim final report.

Additional work in video compression should be done at NASA-ARC. Three 
specific areas are simple systems, conditional replenishment hardware, and 
publications.

The question of the cost-effectiveness of video compression was considered
in the proposal for the additional statement of work, July 2b, 197^5. Only a
few points will be repeated here. Theoretical knowledge and cost of hardware
for video compression define several distinct systems, with different cost,
transmission rate, and quality. Conditional replenishment has the highest cost,
and the lowest transmission rate (1/4 to 1/2 bits per pel) for acceptable quality.
Other systems may be more cost effective. An intraframe compressor with variable
rate and limited buffering has lower hardware cost, and medium rate (1 to 2 bpp)
for acceptable quality. The least expensive fixed-rate intraframe compressors,
such as adaptive delta modulation, provide acceptable quality at higher rates
(2 to 3 bpp). These systems are all compatible with digital links, error correcting
coding, and encryption, but the simpler systems would not be major cost
components. Simulation and development of simple systems is desirable.

The simulated conditional replenishment system is buildable and effective,
but modifications can be made. The I and Q color signals are not tested for
mode determination or for changes in time, but simulation artifacts make it
obvious that the hardware should make these tests. An extensive range of modes,
quantizations, and change thresholds has not been simulated, since they will be
optimized in hardware. Some of the features of the simulation are arbitrary, and
can be changed. Examples include the refresh by lists, and the absence of directed
refresh, for recent changes with lower quality. Fundamental changes, which may
introduce unanticipated artifacts, should be simulated.

The NASA-ARC video compression project has made many deviations from the
direct approach of simulating, designing, fabricating, and testing the conditional
replenishment hardware. The only justification for these diversions is that they
improve the final product. Conditional replenishment hardware must be built and
demonstrated.

The current video simulation has not been presented in a paper, although
an earlier system was described and video tapes presented at two conferences.
If new video material of suitable length and quality can be obtained, a paper
should be given. Other potential subjects for papers include the quasi-cosine
transforms, Landsat table look-up compression, nonstationarity of video data,
and bandwidth compressive modulation.

I. Simple Systems Study Plan

Task II of the additional statement of work is to review the simple systems
described in the literature, and to simulate some of the most promising. The
responsive proposal indicated that an inexpensive compressor would use only
intraframe compression, and stated that it was unlikely that any innovation could
be found in the well studied area of predictive or DPCM compression. The proposal
also noted that combinations of two different methods, such as hybrid or dual mode,
were often superior to single methods.

The literature review for simple systems has indicated an interesting approach
to the design of compression systems. The non-stationarity of video data is a
well known design problem. Various ad-hoc design methods have been developed to
deal with non-stationarity, such as adaptive predictors or selected quantizers,
but the basic design is usually made for an assumed average stationary source.
Berger's work on composite sources (1), and the development of universal coding
by Ziv (2), Davisson (3), and others suggest that compression systems be designed
specifically for non-stationary data. Since the source statistics vary widely,
the compressor should include widely different techniques. This theoretical work
explains the effectiveness of earlier systems combining different methods, and
suggests further development in this direction.

The objective of the simple systems study is to devise a highly effective
intraframe compression system, using little memory. The approach is to use the
non-stationarity, by combining several different basic techniques. The use, and
refinement, of Berger's model should make this work new and interesting.

The simple systems study consists of three phases. The first is a general 
literature review, the second is theoretical and experimental work leading to a 
video source model, and the third is the selection and simulation of various 
candidate systems.

The literature review is divided into three topics: systems, universal coding,
and non-stationary sources.

II. Basic Compression Systems

In this section, basic compression methods are reviewed from the point
of view of complexity and potential performance. The purpose of a teleconference
video compressor is to reduce transmission rate as much as possible, while
maintaining the desired quality. This is accomplished by removing redundancy
from the data, and by introducing unobjectionable distortion. Greater compression
requires greater complexity, measured in memory size and number of computations.
Since the video image samples have correlation in the scan line, between lines in
a frame, and between frames in time, the potential compression can be increased
by increasing the memory span to include previous samples, lines and frames.
Since the video source is nonstationary, requiring varying rate for constant
distortion, a long memory span is also useful for trading bits within a frame to
maximize overall quality.

The amount of data required for the compression computation is an approximate
indication of system complexity. Various general compression systems are listed
in Table 1, according to the compression data span. This table includes most
of the different basic techniques in the literature. The major headings 1, 2, and
3 indicate whether the data span is samples, lines, or frames. The subheadings,
A, B, C, D, E, indicate the different constraints placed on the data span: single
sample, past samples, variable sample span, fixed block sample span, or moving
fixed length sample span. Single sample systems can not remove any redundancy,
and are used after other compression techniques. For example, transform
coefficients are usually quantized with reduced range, and DPCM values are quantized or
Huffman coded. Past sample in-line systems, such as DPCM and delta modulation,
are the simplest systems, and have been studied since the 1950's. Variable sample
span systems are also conceptually simple, but produce variable output rate per
sample. The fixed block techniques are more powerful than past sample techniques,
but often introduce block edge artifacts. Like fixed block techniques, moving
window techniques introduce delay so that future samples are used, but edges are
avoided.
These techniques can be extended to several lines or several frames. Some extensions
are familiar, such as two and three dimensional DPCM and transforms. Line and
frame repeat are degenerate cases of differencing or transforming, in which the
difference or needed coefficient data are not transmitted. Contouring is the
two dimensional analog of identifying line sample groups with similar values,
as in run length coding. The three dimensional or time analogs are conditional
replenishment or motion prediction, which identify two dimensional regions that
persist for some variable time span.
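
The simplest past sample in-line system mentioned above, previous sample DPCM, can be sketched in a few lines. The example assumes a uniform quantizer with a step of 8 levels and a predictor equal to the previous reconstructed sample; these choices, the function names, and the sample values are illustrative and are not taken from any system in Table 1.

```python
def dpcm_encode(samples, step=8):
    """Quantize the difference between each sample and the previous reconstruction."""
    prediction = 0
    codes = []
    for x in samples:
        code = round((x - prediction) / step)
        codes.append(code)
        # Track the decoder's reconstruction so quantizing errors do not accumulate.
        prediction += code * step
    return codes


def dpcm_decode(codes, step=8):
    """Rebuild the samples by summing the dequantized differences."""
    prediction = 0
    output = []
    for code in codes:
        prediction += code * step
        output.append(prediction)
    return output


# Hypothetical scan line segment with a flat area, an edge, and another flat area.
scan_line = [100, 102, 104, 110, 140, 141, 139, 100]
codes = dpcm_encode(scan_line)
decoded = dpcm_decode(codes)
print("codes:  ", codes)
print("decoded:", decoded)
```

Only one previous value is held in memory, which is why such systems sit at the low complexity end of the classification. The codes are small in flat areas and large at edges, so they are usually followed by a single sample quantizer or Huffman coder.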

Table 1 emphasizes distinctions made using the data span required. The
three vertical divisions indicate the amount of data in memory, while the horizontal
divisions indicate the data span used in the compression calculation. These are
two basic indications of system complexity. The memory requirement is equal to
the data span required in future computations, while the computation complexity
is proportional to the data span used in each calculation, for example the transform
block size. Similarly, the data span is an indication of potential performance.
The use of many previous in-line samples gives little gain over the use of one
previous sample, but the use of samples in adjacent lines and frames gives significant
additional compression. The brute force approach to increasing compression is to
include more redundancy in the computational data span. Even when additional
lines and frames are included in the memory span, the useful computational span
remains local.

There are several other important bases for classification of compression
systems, including fixed rate/variable rate, and fixed method/adaptive method.
It is realized that the above discussion of different systems is very brief.
Further description of basic compression systems will be given as part of the
selection of the systems to be simulated.

III. Universal Coding

The theory of universal coding is useful in the compression of video data,
even though video data is nonstationary, and universal coding usually assumes a
stationary source. Davisson and Gray (4) have written an excellent review article.
The classical Shannon theory defines the rate distortion bound for source compression,
when the source statistics are known. Universal coding considers the problem of
encoding any one of a class of sources, each with different statistics. Many of the
proofs of universal coding theorems are constructive rather than existence proofs,
and give insight into the design of practical systems.

Universal coding techniques are usually classified as fixed rate or variable
rate, and as noiseless or having a fidelity criterion. We first consider fixed
rate universal coding. Sakrison (5) found that, if the average distortion is
constrained for all possible sources, the fixed rate required for encoding an
unknown source is the largest rate that would be required for any possible source.
For fixed rate source coding and bounded distortion, the required rate is that of
the worst case source. Ziv (2) examined the distortion of fixed rate systems, and
found that universal codes can be constructed to achieve the minimum attainable
distortion for the true but unknown source. For fixed rate source encoding,
Sakrison showed that the rate must be sufficient to encode the worst source with the
allowed distortion, but Ziv showed that the distortion for any source need be no
worse than if the exact source statistics were known. Universal codes are those
that achieve a bound for known statistics, even when the exact statistics are not
known in advance. Neuhoff, Gray, and Davisson (6) present a unified theory of
fixed rate universal coding. Different definitions of universality, and different
conditions on the class of sources are considered.

Davisson (3) gave a full treatment of universal noiseless coding, and first
defined the different types of universality. Ziv's universal codes can result
in a chosen fixed rate that is greater than the message entropy (the rate required
for noiseless coding) for the actual source. Using the difference in noiseless
transmission rate, Davisson showed the existence of variable rate universal codes
that achieve the minimum rate for noiseless coding for any unknown source in a
class of sources. For fixed rate noiseless coding, the transmission rate must
be greater than the entropy of the worst source, as in the fixed rate, bounded
distortion result of Sakrison.

Variable rate universal coding with a distortion criterion has been examined
in more recent papers by Davisson, Pursley, Mackenthun, and Kieffer.
The different types of universality first introduced by Davisson are considered.
Universal codes exist which approach the minimum rate for the actual source
selected, while not exceeding a fixed distortion.

Gray and Davisson (11) give a simple universal coding theorem. Suppose
there is a class of sources producing symbols in some alphabet, so that each
source has different symbol statistics. The number of possible sources in the
class, m, is finite, and one of the sources is selected initially. The source
coder reproduces the source alphabet symbols using some reproduction alphabet.
A distortion measure defines the average distortion between the source symbols
and the reproduction symbols, and each possible source can be coded according to
its own rate distortion bound. That is, for any distortion, D, there exists a
code book with rate less than or equal to the rate distortion bound R(D), if the
number of symbols, n, in a coded block is sufficiently large. Since the number
of possible sources is m, the codewords in each source codebook can be augmented
using log2 m bits to indicate which source codebook they belong to. The source
coder encodes a block of n symbols using the shortest augmented codeword in any
codebook. The rate is increased by (1/n) log2 m bits per symbol, and this increase
approaches zero for large n. Thus the rate distortion bound, which is defined for
large n, is achieved for any possible source.
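
The vanishing overhead of the codebook index can be illustrated numerically. The sketch below is a simplified noiseless special case, not the rate distortion construction of the theorem: it assumes m candidate binary memoryless sources, uses the ideal code length for each candidate in place of an actual codebook, and rounds the index up to whole bits. All names and values are illustrative.

```python
import math


def index_overhead_bits_per_symbol(m, n):
    """Per-symbol cost of naming one of m codebooks for a block of n symbols."""
    return math.ceil(math.log2(m)) / n


def ideal_length_bits(block, p_one):
    """Ideal code length of a binary block for a memoryless source with P(1) = p_one."""
    ones = sum(block)
    zeros = len(block) - ones
    return -(ones * math.log2(p_one) + zeros * math.log2(1.0 - p_one))


def two_part_rate(block, candidate_ps):
    """Bits per symbol when the best candidate code is used and its index is sent."""
    best_bits = min(ideal_length_bits(block, p) for p in candidate_ps)
    index_bits = math.ceil(math.log2(len(candidate_ps)))
    return (index_bits + best_bits) / len(block)


candidates = [0.1, 0.3, 0.5, 0.7]           # m = 4 assumed source models
pattern = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]    # block with 10% ones
for repeats in (1, 10, 100):
    block = pattern * repeats
    print(len(block), index_overhead_bits_per_symbol(len(candidates), len(block)),
          two_part_rate(block, candidates))
```

As the block length grows, the two-part rate approaches the rate of the best single candidate, since the index cost is spread over more symbols.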

The uncertainty about the actual source requires added information transmission,
which causes the rate to approach the rate distortion bound more slowly as n
increases. If the class of sources is infinite, as it would be if indexed on a
continuous parameter, the distortion measure can be used to divide the class of
sources into subclasses, such that each can be coded using a single code that
nearly achieves the rate distortion bound for all members of that subclass. If
the number of subclasses is finite, the above proof holds.

The proofs of universal coding theorems all use a similar fundamental
approach. The message describing a source symbol block is divided into two parts.
The first part indicates the method of encoding, and the second part contains the
encoded data. The method of encoding is indicated in two different ways. In
constructive proofs, the source statistics needed to define the coding method
are estimated and transmitted. For example, the probabilities of the codewords
define the Huffman code. Existence proofs are similar to Shannon's original
proofs of the channel-coding/capacity theorem and of the
source-coding/rate-distortion theorem, in that average or typical codes are shown
to achieve the bounds, but no construction method is given. The theorem proof
outline given above is an existence proof, because good codes for each possible
source were assumed to exist, and used in the universal code. In both constructive
and existence proofs, the overhead of the first part of the message, used to
indicate the coding method, is negligible for large source symbol blocks.

The universal coding constructive proofs use some interesting ideas. An
early method was suggested for run length compression by Lynch (12) and Davisson
(13), and was later considered in more depth by Cover (14). Suppose that only
a small number of the original data samples in a fixed block are to be transmitted.
This information can be transmitted as the number of samples used, the location
of the samples, and the value of the samples. The number and values are directly
encoded, while the location is encoded as the list ordering of the correct
permutation of q samples in n possible positions. If the alphabet size is p, the
transmission rate for q symbols transmitted out of a block of n is

$$R = n^{-1} \left[ \log_2 q + \log_2 \binom{n}{q} + q \log_2 p \right]$$

The first term is negligible for large n, the second term describes equally likely
messages which have no potential coding gain, and the third term indicates a sample
reduction compression gain of q/n.
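The three terms of this rate can be evaluated directly. The sketch below assumes the reading of the formula in which the location term is the base-2 logarithm of the number of ways to choose q positions out of n; the function name `sample_selection_rate` is hypothetical.

```python
import math


def sample_selection_rate(n: int, q: int, p: int) -> float:
    """Bits per original sample when q of n samples are sent, each sample
    taking one of p alphabet values."""
    if not 1 <= q <= n or p < 2:
        raise ValueError("require 1 <= q <= n and p >= 2")
    count_bits = math.log2(q)                     # how many samples are kept
    location_bits = math.log2(math.comb(n, q))    # which q of the n positions
    value_bits = q * math.log2(p)                 # the kept sample values
    return (count_bits + location_bits + value_bits) / n


if __name__ == "__main__":
    # Illustrative numbers only: 64-sample block, 6-bit samples, 8 samples kept.
    print(sample_selection_rate(n=64, q=8, p=64))
```

Only the last term scales with the alphabet size, which is why the gain over sending all n samples approaches q/n when the samples carry many bits each.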

The histogram method of Fitinghoff was described by Davisson (3), and was
developed before universal coding. The conditional histogram of data subblocks
is measured, encoded, and transmitted, along with the codeword for the full block
constructed according to the histogram. If the source symbols are independent,
the histogram does not have to be conditional. For example, the probability of
each symbol can be transmitted and used to define a Huffman code for blocks of
symbols.

Ziv (2) used a different method to obtain the first universal coding
results. A block of n symbols is broken up into n/k subblocks of k symbols.
The first I different subblocks are directly encoded and transmitted, using
$k (\log_2 p) I$ bits, where p is the alphabet size. These I subblocks are used to
construct a table, and subsequent transmissions indicate the subblocks by their
position in the list, using $\log_2 I$ bits. The total rate is

$$R = k (\log_2 p) I + (n/k) \log_2 I$$

This method causes errors if the list does not contain all the required subblocks,
but this is improbable for large I and n. The effect of errors is reduced if
the closest list subblock is used, or if the table is continually revised.

The coding of nonstationary sources is a practical problem that has received
little theoretical attention. Berger developed a fundamental model for
nonstationary sources (1). A composite source is a finite class of individual sources,
one of which is selected by a statistical switch. The switching probabilities
determine the relative amount of time each source is used. If neither the encoder
nor the decoder can be informed of the exact sequence of switch positions, the
composite source is an unseparable mixture of the individual sources. The average
fraction of each source used, and the symbol statistics of each source, define the
statistics of the mixture distribution. The only thing that can be done is to
encode the composite source as one stationary source, having the mixture statistics.
However, if both the encoder and the decoder can be fully informed of the sequence
of switch positions, the rate distortion bound of each individual source can be
achieved. If the sources have very different statistics, the average of the rate
distortion bounds is more favorable than the rate distortion bound of the mixture
source. This model will be further investigated and applied in later sections.
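For noiseless coding the rate bound is the entropy, so the advantage of knowing the switch position can be illustrated with entropies alone. The sketch below is an assumption-laden example, not taken from Berger: the two binary subsource distributions and the equal switch probabilities are invented, and the function names are hypothetical.

```python
import math


def entropy_bits(probs):
    """Entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mixture_and_separated_entropy(switch_probs, subsource_dists):
    """Return (entropy of the mixture source, average of subsource entropies).

    The mixture distribution is P(x) = sum over k of P(k) P(x/k)."""
    alphabet_size = len(subsource_dists[0])
    mixture = [
        sum(pk * dist[i] for pk, dist in zip(switch_probs, subsource_dists))
        for i in range(alphabet_size)
    ]
    separated = sum(
        pk * entropy_bits(dist) for pk, dist in zip(switch_probs, subsource_dists)
    )
    return entropy_bits(mixture), separated


if __name__ == "__main__":
    # Invented example: two very different binary subsources, used equally often.
    mix, sep = mixture_and_separated_entropy([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]])
    print(mix, sep)
```

The gap between the two values is the most that can be gained when the switch position is known to both ends at no cost; it shrinks to zero as the subsources become alike.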

Ziv (2) extended universal coding to the problem of encoding any one of a
class of nonstationary sources. This is not the problem of encoding one
nonstationary source using Berger's model. A fixed rate, unchanging code is assumed.
It is assumed that each nonstationary source is an unseparable mixture composite
source. Ziv finds fixed rate universal codes which achieve, for any selected
source, the minimum distortion obtainable by any fixed rate code for the exact
nonstationary source statistics. There are universal codes that achieve the mixture
statistics rate distortion bound, for any one of a class of nonstationary sources.

Gray and Davisson (11) consider a model similar to that of Berger. The
composite source can be considered to be locally stationary, while the switch is in
one position. If the switch position varies very slowly, then the universal
coding results for a permanently fixed switch can be applied to the nonstationary
source. The code block length should be short in comparison to the average
switching time. Gray and Davisson later (4) reemphasize that the correct choice of
block length allows universal coding for nonstationary sources that are locally
stationary.

Gray and Davisson (4) coded video data (apparently Landsat) using DPCM
with a fixed length code, three different Huffman codes, and a run length coder.
Although the average sample entropy of the differences was 3.3 bits, selecting
the best coding method for each block of 64 samples gave a rate of 3 bits per
sample for noiseless coding. A rate distortion bound less than the mixture rate
distortion bound was achieved, because of the effective separation of the subsources
of a nonstationary source. Rice and Plaunt (15) earlier used a system of three
different Huffman codes on similar data, and considered their results to be near
optimum because they were within 0.25 bits per sample of the mixture entropy.
More recent authors still appear to consider the mixture rate distortion function
to be the appropriate bound for nonstationary sources.
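The per-block selection described here can be sketched in a few lines. This is not the coder of either cited paper: Golomb-Rice codes with parameter k are substituted for the Huffman and run length codes because their codeword lengths are easy to compute, and the block contents, block size, and function names are all illustrative assumptions.

```python
def rice_code_bits(value: int, k: int) -> int:
    """Length in bits of the Golomb-Rice codeword for a non-negative integer:
    a unary quotient (value >> k, plus a stop bit) and k remainder bits."""
    return (value >> k) + 1 + k


def fixed_code_bits(blocks, k):
    """Total bits when one code is used for every block."""
    return sum(rice_code_bits(v, k) for block in blocks for v in block)


def adaptive_code_bits(blocks, ks=(0, 1, 2, 3)):
    """Total bits when the cheapest code is chosen per block and the choice is
    sent as overhead ahead of each block."""
    overhead = max(1, (len(ks) - 1).bit_length())  # bits naming the chosen code
    total = 0
    for block in blocks:
        total += overhead + min(sum(rice_code_bits(v, k) for v in block) for k in ks)
    return total


if __name__ == "__main__":
    # Two invented blocks standing in for a flat region and a busy region.
    blocks = [[0] * 8, [12] * 8]
    best_fixed = min(fixed_code_bits(blocks, k) for k in (0, 1, 2, 3))
    print(best_fixed, adaptive_code_bits(blocks))
```

The selection overhead is charged once per block, so it becomes negligible as the block length grows, while the saving comes from matching the code to the local statistics.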

The above work of Berger and of Gray and Davisson provides a new basis for
the compression of nonstationary sources. Nonstationary sources can be modeled
as switched subsources, and the rate distortion bound of any subsource can be
approached, if the overhead to indicate subsource switching is negligible. Since
the rate distortion bound of a separable composite source can be less than the rate
distortion bound of the corresponding mixture source, the consideration of
nonstationarity provides an opportunity to improve performance.

ABSTRACT


It is well known that the video image source is nonstationary, and that
adaptive compression can obtain improved performance. In this paper, a composite
source model for nonstationary sources is developed. This model illustrates how
the improved rate distortion bound for nonstationary sources is used in practical 
design. All the source models assume that the intersample dependency of the video 
samples is removed by a first order one dimensional predictor. The experimental 
statistics of a test set of video images are examined to define the parameters of 
different nonstationary source models. The performance of adaptive prediction, 
adaptive entropy encoding, and adaptive quantization are examined and compared to 
results reported in the literature. It is found that the improvement in rate 
distortion bound due to nonstationarity is relatively small, and much of the gain 
of fixed rate adaptive systems is achieved by more closely approaching the rate 
distortion bound for stationary data. In agreement with quantization theory, 
variable rate entropy encoding of quantizer output values closely approaches the 
rate distortion bound.

INTRODUCTION


It is shown in information theory that memory reduces the information rate
of a source. A source with memory can be modeled as a composite source, having
different subsources incorporating the effect of different past symbol sequences
on the next symbol. At any given time, one of the subsources is selected by a
switch. When the rate required to transmit the switch position is included, the
transmission rate of the composite source is equal to the average rate of the
subsources, plus the additional rate to exactly indicate the switch position.

The rate distortion bound is defined by the optimum selection of the subsource
definition, and of the optimum source coding for each subsource. The composite
source model demonstrates the relationship between information theory and the
common adaptive source coding method of first selecting the best of several
alternate encoders and then identifying the particular encoder to the decoder
using overhead transmission rate.

The composite source model is used in the analysis of adaptive source coding
for the video image source. The memory due to the nonstationarity of the Markov
model of sample dependence is investigated, rather than the memory due directly
to the sample dependence. If the general form of the data distribution is known,
the first order Markov model for the dependent video samples is defined by the
mean, variance, and correlation of the samples. These determine the design of
the predictor, quantizer, and quantizer output entropy encoder. The experimental
mean, variance, and correlation obtained using wide area averages define the image
mixture source, which is the stationary source having the same subsources and
subsource probabilities as the actual nonstationary source. The rate distortion
function of the mixture source is an upper bound on the actual rate distortion
function, and the mixture source statistics restrict the composite source model.

Analysis of the mixture source statistics indicates that adaptive predictors
cannot obtain any significant improvement in rate distortion performance. This
is confirmed by several experiments. The adaptive image encoders described in the
literature all use adaptive entropy encoders or adaptive quantizers.




Experimental results on adaptive entropy encoding indicate that the potential
gain is less than ten percent, for one test set of video images. Such gains have
been reported previously for Landsat images. Experiments with adaptive quantizers
are also in agreement with reported results that rate improvements of about
twenty-five percent can be obtained. Theory indicates that much of the gain of
adaptive quantizers is due to more closely approaching the rate distortion bound
for the equivalent stationary source, with ten percent due to the nonstationary
memory.

THE SWITCHED SUBSOURCE MODEL FOR SOURCES WITH MEMORY


Statistical data dependency, due either to correlated data or to nonstationary
data statistics, reduces the information transmission needed to achieve a specified
fidelity. In this section, the switched subsource model for sources with memory
is described. The rate distortion function (rdf) of a source is the minimum
transmission rate required to achieve a given fidelity. The switched subsource model
shows how an improved rdf can be achieved by the identification of subsources,
which incorporate memory of the previous data. The source model is based on Berger's
composite source with side information and is fundamental in adaptive source
compression and universal coding. The switched subsource model will later be
developed using experimental image data, and used to design adaptive image
compression systems.

A composite source, as defined by Berger, is shown in the left part of
figure 1. The boxes contained in the composite source, labeled P(x/s), s = 1, 2, ...,
M, are independent subsources which produce discrete symbols in the common alphabet
$x_i$, i = 1, ..., A. At any given time, one subsource is selected by a switch
with known statistics. As in Berger's work, the rate distortion bound of the
composite source is determined by examining the performance of a source
encoder/decoder, which is also shown in figure 1. The switch position is determined
by direct observation, or by examination of data which is delayed and encoded after
the switch position is estimated. The switch position is described to the source
symbol encoder, which uses the optimum encoder for each subsource. The switch
position information is then combined with the encoded symbols, and transmitted
to the decoder. The decoder uses the switch position information to select the
correct source symbol decoder.

Berger considered cases where the switch positions are independent random
variables, or are arbitrarily controlled. Here, the case of a switch with memory
is also considered. Berger considered cases where the side information describing
the switch position was provided directly to the symbol encoder and decoder. Here,
the overhead rate required to transmit the switch position information is included.
Berger's results include limiting cases of the model of figure 1.

The rate distortion function of the composite source is the minimum of
the mutual information of the source encoder and decoder of figure 1. x is the
source symbol, y is the corresponding decoded symbol, and their joint probability
distribution, P(x,y), is determined by the encoder/decoder. The mutual information
between x and y is

$$I(x;y) = \sum_{i,j} P(i,j) \log_2 \frac{P(i,j)}{P(i)\,P(j)} \qquad (1)$$

The indices i, j vary over all combinations of source symbol x and receiver symbol y.

The subsource identification information is used by the encoder and decoder
to select a particular encoder and decoder. The joint distribution of x and y
for the subsource k is P(x,y/k), and the mutual information is

$$\begin{aligned}
I(x;y) &= \sum_k \sum_{i,j} P(k)\,P(i,j/k) \log_2 \frac{P(k)\,P(i,j/k)}{P(k)P(i/k)\,P(k)P(j/k)} \\
&= \sum_k P(k) \sum_{i,j} P(i,j/k) \left[ \log_2 \frac{P(i,j/k)}{P(i/k)\,P(j/k)} - \log_2 P(k) \right] \\
&= \sum_k P(k) \sum_{i,j} P(i,j/k) \log_2 \frac{P(i,j/k)}{P(i/k)\,P(j/k)} - \sum_k P(k) \log_2 P(k)
\end{aligned} \qquad (2)$$

If $d_k$ is the distortion of the k th subsource, the average distortion for the
composite source is

$$D = \sum_k P(k)\, d_k$$

The rate distortion function is the minimum of the mutual information for all
subsource encoder/decoders, or all P(x,y/s), such that $d_k$ is achieved for each
subsource.

$$R(D) = \min_{P(x,y/s),\; d_k} I(x;y) = \sum_k P(k)\, R_k(d_k) + H_{sw}[P(k)] \qquad (3)$$

where the $d_k$ are such that the equation for D is satisfied, and the slopes of
the $R_k(d_k)$ are equal (see Berger, pp. [illegible]). $R_k(d_k)$ is the rate
distortion function of subsource k, and $H_{sw}[P(k)]$ is the entropy of the switch,
for independently chosen switch positions. The first term of equation 3 corresponds
to a result of Berger. The equation indicates that the source encoder/decoder of
figure 1 achieves the rate distortion function of any selected subsource, at the
cost of exactly transmitting the subsource identification.
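The two terms of equation 3 can be evaluated for a simple case. The sketch below assumes memoryless Gaussian subsources with squared-error distortion, for which $R_k(d) = \frac{1}{2}\log_2(\sigma_k^2/d)$; under that assumption the equal-slope condition gives every subsource the same distortion D, provided D does not exceed the smallest subsource variance. The Gaussian assumption, the numbers, and the function names are illustrative and do not come from the report.

```python
import math


def switch_entropy(switch_probs):
    """Entropy in bits of independently chosen switch positions."""
    return -sum(p * math.log2(p) for p in switch_probs if p > 0)


def composite_rate(switch_probs, variances, distortion):
    """Right side of equation 3 for assumed memoryless Gaussian subsources,
    each coded at the common distortion implied by equal slopes."""
    if distortion <= 0 or distortion > min(variances):
        raise ValueError("sketch covers only 0 < D <= smallest subsource variance")
    subsource_term = sum(
        p * 0.5 * math.log2(var / distortion)
        for p, var in zip(switch_probs, variances)
    )
    return subsource_term + switch_entropy(switch_probs)


if __name__ == "__main__":
    # Invented example: two equally likely subsources with variances 4 and 16.
    print(composite_rate([0.5, 0.5], [4.0, 16.0], distortion=1.0))
```

In this example the switch term is a full bit per symbol, which shows why the overhead matters when the switch is re-chosen independently at every symbol.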

Suppose that, instead of being independent, the switch position during each
symbol depended on the previous switch positions. Then P(i,j/k) can be replaced
by P(i,j/k,l) in the above derivation, where l describes the effect of previous
switch positions. Since the current encoder/decoder depends only on the current
switch position, the above result is obtained with $H_{sw}[P(k)]$ replaced by
$H_{sw}[P(k,l)]$. $H_{sw}[P(k,l)]$ is the entropy of a switch with memory. A switch
with memory models sources where the switch position changes slowly compared to the
symbol rate.

The rdf of a composite source is the minimum mutual information, under the
constraint on the distortions $d_k$, for all subsource identification methods and
for all encoder/decoders. To determine the rdf, the switch position identification
method must be variable, although the composite source definition is unchanged.

$$R(D) = \min_{P(x,y/s),\; d_k,\; sw} I(x;y) \qquad (4)$$

The only change from equation 3 is the indication, "sw", that the minimization
is made over all methods of switch position indication, as well as all
encoder/decoder sets, and the constrained distortion. The switch position indication
method may vary in different regions of the rdf. If the switch position indication
can be transmitted at no cost, the fullest possible subsource identification
information should be transmitted. The subsource definitions are often arbitrary
to some extent, and the composite source can be modeled as a unifilar source (ref. 2),
which has each source symbol uniquely associated with a subsource. For a unifilar
source, all the source symbol information is transmitted as subsource identification
information, so that some constraint on this information is necessary.

When no subsource identification is transmitted, a single encoder/decoder must
be used for all the subsources. As indicated by Berger, the composite source is
treated as a single source, having the mixture source probability distribution

$$P(x) = \sum_k P(k)\, P(x/k)$$

The mixture statistics determine the mutual information and the rdf.

$$R_{mix}(D) = \min_{P(x,y),\; D} I(x;y) \qquad (5)$$

where I(x;y) is given by equation 1. Obviously,

$$R(D) \le R_{mix}(D)$$
If $R(D) = R_{mix}(D)$, the rdf is not improved by using the subsource identification
information, and the source is effectively stationary. When $R(D) < R_{mix}(D)$, the
use of separate encoder/decoders for each subsource results in an improved rdf, even
when the required subsource identification overhead transmission rate is considered.

DISCUSSION OF THE SWITCHED SUBSOURCE MODEL


The composite source model of figure 1 can be used to represent a source
with dependent samples, by including one subsource for each possible current
state, as defined by the relevant past source symbols. The model can also be
used to represent a source with a variable symbol distribution, by including
one subsource for each distribution. The different subsources represent different
past source behavior, which affects the current symbol output. Removing the
sample dependency is the basic method of data compression. Adaptive source
coding techniques, based on a nonstationary source model, usually achieve
superior compression performance. In this paper, the composite source model is
used to investigate the nonstationary behavior of correlated image data.

In the image coding literature, many adaptive systems have been developed
which change the encoder/decoder to optimize performance for nonstationary
signals. Rate reductions of one-third to one-half have been achieved, compared
to the best single encoder/decoder. Usually, the method of figure 1 is used.
Experimental insight is used to partition the nonstationary source into subsources,
an encoder/decoder is designed for each subsource, and the subsource identification
and the encoded symbols are transmitted. In an early paper using this approach,
Tasto and Wintz (ref. 4) derived the rate distortion bound of equation 3, which they
presented as an experimental upper bound on the rdf. For their image data and
particular subsource definition, the experimental rdf is much more favorable than
the memoryless Gaussian rdf, and usually more favorable than the rdf for dependent
sample Gauss-Markov data.

Any source encoder/decoder has some implied corresponding source model, or at
least represents a reasonable compromise between obtaining the rdf and limiting
the implementation complexity. In practice, stationary data models are
usually used to design source encoders for the nonstationary image source. Adaptive
encoders are often only subsequent modifications of stationary encoders, designed
to overcome the observed effect of nonstationarity. The composite nonstationary
source model leads to a more fundamental treatment of adaptive image
compression methods.

Universal coding uses the switched subsource model. In universal coding
theory, one subsource is initially selected from the group of subsources, and
the particular subsource selected is determined by observation of the output
symbols. The subsource identification is used in encoding and decoding the
source symbols, and is transmitted to the decoder using overhead rate. Universal
coding theory shows that universal codes exist which achieve the rdf of any
possible subsource. This is true because the overhead transmission is negligible,
for long block length symbol codes. This method and result are included in figure
1 and equation 3 above. Gray and Davisson (ref. 5) observe that this result applies
to nonstationary sources, if the switch position changes slowly. Gray and Davisson
elsewhere give an example of noiseless coding for image data, where the average
transmission rate is less than the mixture entropy.

In the switched subsource model, the nonstationarity is limited to
intermediate time intervals. For "short" time intervals, usually only one stationary
subsource is switch selected, and the nonstationary source is "locally stationary".
For "long" time intervals, the stationary switch statistics produce the symbol
statistics of the mixture source, and the nonstationary source is "long term
stationary". If one of the subsources of the composite source is nonstationary,
it may also be modeled as a composite source, and then combined with the original
composite source so that only one switch is used in the model. If the composite
source switching is nonstationary, the source can be modeled as a group of
composite subsources, each with different stationary switching statistics, one
of which is selected by a second switch. This requires a multiplication of
encoder/decoders, since the subsource probabilities affect the rate and
distortion allocated to each subsource. Since the rate and distortion are average
measures, the loss in modeling a low probability nonstationary subsource as
stationary is relatively minor.

THE STATIONARY IMAGE MODEL AND EXPERIMENTAL IMAGE STATISTICS


The sampled image process is usually modeled as a wide-sense stationary,
first-order Markov process (refs. 7, 8). The stationary source model and the mixture
source statistics provide important information about the composite source model.
For the samples $x_i$, i = 1, 2, ..., N, the mean, variance, covariance, and
correlation coefficient are defined as follows:


$$\begin{aligned}
E(x_i) &= \mu \\
E[(x_i - \mu)^2] &= \sigma^2 \\
E[(x_i - \mu)(x_j - \mu)] &= E(x_i x_j) - \mu^2 = c_{ij} \\
r_{ij} &= c_{ij} / \sigma^2
\end{aligned} \qquad (6)$$

By definition of the first-order Markov process

$$r_{ij} = r^{|i-j|} \qquad (7)$$

The optimum source coding for the first-order Markov process is well known and
requires a simple predictor and difference encoder. The best estimate of $x_i$, given
all the $x_j$, for j < i, is identical to the best estimate given $x_{i-1}$. The optimum
predictive source coder is shown in figure 2. The optimum estimate of $x_i$ is

$$\hat{x}_i = r (x_{i-1} - \mu) + \mu \qquad (8)$$

Figure 2. The optimum predictive source coder for a first-order Markov process.

The predictor bias and mean-square error are

$$\begin{aligned}
E(x_i - \hat{x}_i) &= 0 \\
E[(x_i - \hat{x}_i)^2] &= \sigma^2 (1 - r^2)
\end{aligned} \qquad (9)$$


In source coding, both the encoder and decoder compute $\hat{x}_i$, and the difference
$(x_i - \hat{x}_i)$ is computed at the encoder, quantized, and transmitted to the decoder.
The rate distortion function of the first order Markov source with Gaussian
sample distribution is

$$R(D) = \frac{1}{2} \log_2 \frac{\sigma^2 (1 - r^2)}{D} \quad \text{for } D \le \sigma^2 \frac{1 - r}{1 + r} \qquad (10)$$

For r = 0, the rdf is that of the memoryless Gaussian source,
$R(D) = \frac{1}{2} \log_2 (\sigma^2 / D)$. $\sigma^2$ is the variance of the original
source samples. The optimum encoder transmits the optimum predictor differences,
which have variance $\sigma^2 (1 - r^2)$.

It is of interest to determine the statistics of the experimental image
test set to be encoded. The statistical measurements corresponding to equation
6 are as follows:

$$\begin{aligned}
\bar{X} &= \frac{1}{2(N-d)} \sum_i (x_i + x_j) \\
S^2 &= \frac{1}{2(N-d)} \sum_i \left[ (x_i - \bar{X})^2 + (x_j - \bar{X})^2 \right] \\
mc_{ij} &= \frac{1}{N-d} \sum_i (x_i - \bar{X})(x_j - \bar{X}) \\
mr_{ij} &= mc_{ij} / S^2
\end{aligned} \qquad (11)$$

where i = 1, 2, ..., N-d and j = i + d.

These equations define the experimental mean, variance, covariance, and correlation.
The addition of two similar factors, and the compensating factor of 1/2, occur in
$\bar{X}$ and $S^2$ so that each of the $x_i$ are used the same number of times as in
$mc_{ij}$ and $mr_{ij}$. d is the sample distance for the covariance and correlation
measurement.
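The experimental statistics can be computed in one pass over the sample pairs. The sketch below assumes the reading of equation 11 in which the mean and variance are averaged over both members of every pair (x_i, x_{i+d}), with a compensating factor of 1/2, so that each sample is used as often as in the covariance. The function name `lag_statistics` and the test data are hypothetical.

```python
def lag_statistics(samples, d=1):
    """Return (mean, variance, covariance, correlation) measured over all
    sample pairs (x[i], x[i + d]), following the pairing described above."""
    n = len(samples)
    if not 1 <= d < n:
        raise ValueError("require 1 <= d < number of samples")
    pairs = [(samples[i], samples[i + d]) for i in range(n - d)]
    count = len(pairs)  # equals N - d
    mean = sum(a + b for a, b in pairs) / (2 * count)
    variance = sum((a - mean) ** 2 + (b - mean) ** 2 for a, b in pairs) / (2 * count)
    covariance = sum((a - mean) * (b - mean) for a, b in pairs) / count
    if variance == 0:
        raise ValueError("correlation is undefined for constant samples")
    return mean, variance, covariance, covariance / variance


if __name__ == "__main__":
    # Invented one-line "image": a slow ramp, which gives a high lag-1 correlation.
    ramp = [float(v) for v in range(64)]
    print(lag_statistics(ramp, d=1))
```

Measuring the same line at several values of d gives the decay of correlation with distance, which can then be compared with the first-order Markov model.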

These statistics were measured for the experimental test set of five images.
The original samples are quantized to six bit accuracy, and the statistics of
table I are in units of the least significant bit. Values of mr (d=1) reported
in the literature are usually between 0.95 and 0.99, and this is true for all the
test images except the highly detailed Band image. The dependence of correlation
on sample distance corresponds to the first-order Markov model, as has been shown
previously (ref. 9). The values of $\bar{X}$ and $S^2$ reflect the average brightness
and contrast of the image, and are not indicators of the image information content.

Table I. Experimental mean, variance and correlation
for five experimental test images.

Image           Mean     Variance   Correlation
Reasoner         1.32      94.17       0.987
Two Girls      -10.92      91.48       0.980
Two Men         -8.81     172.62       0.973
Writing Pad     12.18      43.83       0.979
Band           -32.85      92.24       0.871
Average         -3.86      98.87       0.953


The optimum predictor and rdf of equations 8 and 10 are time invariant for
a wide sense stationary process. If the video source were actually wide sense
stationary, the statistics of equation 11 and table I would have the same expected
values, independent of the particular image or region of an image measured. However,
the video image source is well known to be nonstationary, so that using different
encoder/decoders at different times gives improved performance. The interpretation
of the full image statistics of table I depends on the subsource switching rates
considered in the composite source model. If the switching occurs frequently
during a single image, the full image statistics define different mixture sources.
If the encoder/decoder is selected once per image, the full image statistics are
examples of subsources.

The composite source model to be investigated, based on the stationary model
of the image source described above, consists of a group of first-order, wide
sense stationary sources, one of which is selected by a switch. The first-order
Markov model is specified by $\mu$, $\sigma^2$, and r. If the subsources are defined
using all three of these parameters, a large number of subsources results. In the most
accurate subsource definition, the values of $\mu$, $\sigma^2$, and r could be
periodically measured, quantized to some accuracy, and used to design the
encoder/decoder. The transmission rate overhead would be prohibitive, unless the
subsource switching is very slow. The effect of these three parameters on the rdf is
examined, and experimental measurements are made, to determine the effectiveness of subsource
definitions based on these parameters.

NONSTATIONARY CORRELATION AND ADAPTIVE PREDICTION

In this section, the subsources are defined using only the distance one
correlation, r. The value of r directly affects the predictor, quantizer, and
entropy encoder, as shown in equations 8 and 9. With the image sample variance,
the value of r determines the rdf, as shown in equation 10. The measured values
of r in table I have significant differences between images, and areas of low
detail and high correlation, and high detail with low correlation, are found in
many typical images, including all five experimental test images.

The importance of a source parameter can be determined by the effect of a
source-encoder mismatch in that parameter. We consider the effect of an arbitrary
predictor on the predictor bias and predictor mean-square error. The arbitrary
predictor corresponding to equation 8 is

$$\hat{x}_i = a (x_{i-1} - b) + b$$

In general a does not equal r, and b does not equal $\mu$. The bias and mean-square
error of the predictor are

$$\begin{aligned}
E(x_i - \hat{x}_i) &= E\left( x_i - a (x_{i-1} - b) - b \right) = (1 - a)(\mu - b) \\
E[(x_i - \hat{x}_i)^2] &= E(x_i^2) - 2 E(x_i \hat{x}_i) + E(\hat{x}_i^2) \\
&= \sigma^2 (1 - 2ar + a^2) + (\mu - b)^2 (1 - a)^2
\end{aligned} \qquad (12)$$

The second term in the mean-square error is the square of the bias. These
equations reduce to 9 for a = r, b = $\mu$.

Since the experimental correlation is usually close to one, it is reasonably
effective to set a = 1, b = 0, in a non-adaptive image compression predictor.
The predictor bias is zero, and the mean-square error is $2\sigma^2 (1 - r)$, where r
and $\sigma^2$ are the actual image statistics. The minimum mean-square error is
$\sigma^2 (1 - r^2)$, and is obtained when a = r, b = $\mu$. The increase in error for
a = 1, b = 0, is $\sigma^2 (1 - r)^2$, which is small when r is approximately one.
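The bias and mean-square error of a mismatched predictor follow directly from the source statistics. The sketch below assumes a predictor of the form a(x_{i-1} - b) + b applied to a first-order Markov source with mean mu, variance sigma squared, and correlation r; the function and argument names are hypothetical.

```python
def predictor_bias_and_mse(a, b, r, mean=0.0, variance=1.0):
    """Bias and mean-square error of the predictor a * (x_prev - b) + b on a
    first-order Markov source with the given mean, variance and correlation r."""
    bias = (1.0 - a) * (mean - b)
    mse = variance * (1.0 - 2.0 * a * r + a * a) + bias ** 2
    return bias, mse


if __name__ == "__main__":
    r = 0.95  # illustrative correlation
    _, optimum_mse = predictor_bias_and_mse(a=r, b=0.0, r=r)
    _, unit_mse = predictor_bias_and_mse(a=1.0, b=0.0, r=r)
    # The difference equals variance * (1 - r) ** 2 for this zero-mean case.
    print(optimum_mse, unit_mse, unit_mse - optimum_mse)
```

Because the factor (1 - a) vanishes for a = 1, the previous-sample predictor stays unbiased even when b differs from the true mean.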

These results can be used to evaluate the effect of using the a = 1 predictor
for subsources having r not equal to one. The correlation, error, and rate change
gain based on the rdf are given in table II for a predictor using the correct value
of r, and in table III for a predictor using a = 1. A larger negative value of
rate change indicates that less rate is required, and higher compression is obtained.
As expected, there is very little difference in performance of the two predictors
for correlations approximately equal to one. The performance difference is also
small for the lower correlations, because of the small potential compression gain
of the optimum predictor.
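The rate change columns of tables II and III follow from the stated rule, rate change = 0.5 log2(error), with unit variance. The sketch below recomputes them under that assumption; the function names are hypothetical, and the computed values agree with the tabulated ones only to within about 0.02 bit.

```python
import math


def rate_change_optimum(r: float) -> float:
    """Rate change in bits for the optimum predictor a = r, unit variance."""
    if not 0.0 <= r < 1.0:
        raise ValueError("sketch covers 0 <= r < 1")
    return 0.5 * math.log2(1.0 - r * r)


def rate_change_unit_predictor(r: float) -> float:
    """Rate change in bits for the fixed predictor a = 1, unit variance."""
    if not 0.0 <= r < 1.0:
        raise ValueError("sketch covers 0 <= r < 1")
    return 0.5 * math.log2(2.0 * (1.0 - r))


if __name__ == "__main__":
    for r in (0.99, 0.95, 0.90, 0.80, 0.70, 0.50, 0.00):
        print(r, round(rate_change_optimum(r), 3), round(rate_change_unit_predictor(r), 3))
```

The a = 1 predictor gives a positive rate change for r below 0.5, where predicting from the previous sample costs more than sending the samples unpredicted.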

We consider the effect of using a = 1, for two simple composite sources. A
typical value for r is 0.95. If the source is stationary, the use of a = 1 rather
than a = 0.95 increases the rate by 0.01 bits per sample. Suppose the source is
nonstationary, having a subsource with r = 1.0, selected with 95 percent probability,
and a subsource with r = 0.0, selected with 5 percent probability. Tables II and
III show that the required rate is increased 0.5 bits per sample when r = 0.0.
Since this occurs 5 percent of the time, the average rate is increased 0.025 bits
per sample. If the subsources were r = 1.0 with 90 percent probability, and r =
0.5 with ten percent probability, the rate is increased 0.21 bits per sample, ten
percent of the time, for an average of 0.021 bits per sample.
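The two composite-source averages above can be reproduced from the error expressions. Under the unit-variance rule used in the tables, the penalty for using a = 1 in place of a = r is 0.5 log2[2(1 - r)] - 0.5 log2(1 - r^2), which simplifies to 0.5 log2[2 / (1 + r)]. The sketch below uses that simplified form; the function names are hypothetical.

```python
import math


def mismatch_penalty_bits(r: float) -> float:
    """Extra bits per sample from using the a = 1 predictor instead of a = r."""
    if not -1.0 < r <= 1.0:
        raise ValueError("require -1 < r <= 1")
    return 0.5 * math.log2(2.0 / (1.0 + r))


def average_penalty_bits(subsources) -> float:
    """Average penalty for a composite source given (probability, r) pairs."""
    return sum(prob * mismatch_penalty_bits(r) for prob, r in subsources)


if __name__ == "__main__":
    print(average_penalty_bits([(0.95, 1.0), (0.05, 0.0)]))
    print(average_penalty_bits([(0.90, 1.0), (0.10, 0.5)]))
```

The penalty is bounded by 0.5 bit for non-negative correlation, so a low-probability, low-correlation subsource can add only a few hundredths of a bit to the average rate.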

The potential additional compression gain of a nonstationary source model
based on different values of correlation is very small. This explains Habibi's
observation (ref. 3) that adaptive predictors for image data have not been reported in
the open literature. Habibi characterizes the method of Tasto and Wintz (ref. 10) as an
adaptive Karhunen-Loeve transform (KLT), but it has been shown that the KLT designed
for the experimental correlation, like the predictor, is very little better than
the KLT for correlation one. (The KLT for r = 1 is identical to the discrete
cosine transform.) The method of Tasto and Wintz uses a source partition based
Table II. Correlation, predictor mean-square error, and
rate change using the optimum predictor a = r, for the
first order Markov model with $\mu = 0$, $\sigma^2 = 1$.

Correlation, r     Error, $\sigma^2(1-r^2)$     Rate Change, bits
                                                0.5 log2 Error

0.99               0.0199                       -2.83
0.95               0.0975                       -1.67
0.90               0.19                         -1.20
0.80               0.36                         -0.736
0.70               0.51                         -0.500
0.50               0.75                         -0.208
0.00               1.00                          0.00
Table III. Correlation, predictor mean-square error,
and rate change using the predictor a = 1.0, for the
first order Markov model with $\mu = 0$, $\sigma^2 = 1$.

Correlation, r     Error, $2\sigma^2(1-r)$      Rate Change, bits
                                                0.5 log2 Error

0.99               0.020                        -2.82
0.95               0.100                        -1.66
0.90               0.20                         -1.16
0.80               0.40                         -0.662
0.70               0.60                         -0.352
0.50               1.00                          0.000
0.00               2.00                         +0.500

on the experimental mean and variance, as well as the correlation. Even though
the compression gain varies considerably as a function of correlation, little
additional compression gain is obtained by adapting the encoder/decoder for
nonstationary correlation. The typical average correlation is high, and the
encoder for high correlation can be used at low correlation, since there is
little potential gain to be lost.

This conclusion was tested by examining the performance of adaptive predictors
on the five test images. A local predictor could be implemented by using the
experimental mean and correlation of a local sample block, and transmitting the
local mean and correlation with the predictor errors. It is desirable to avoid
the additional overhead of transmitting the local mean. This can be done by
setting b equal to the mean of the samples in several preceding blocks, which
requires no overhead. The local predictor is


$\hat{x}_i = a\,(x_{i-1} - b) + b$


The bias and predictor mean-square error are given in equation 12. 


$E[(x_i - \hat{x}_i)^2] = E[((x_i - b) - a\,(x_{i-1} - b))^2]$

$\qquad = E[(x_i - b)^2] - 2a\,E[(x_i - b)(x_{i-1} - b)] + a^2\,E[(x_{i-1} - b)^2]$          (13)


For minimum mean-square error, the derivative with respect to a is set to zero. 

$a = \dfrac{E[(x_i - b)(x_{i-1} - b)]}{E[(x_{i-1} - b)^2]}$
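
As an illustration of the minimization above, the following sketch estimates the
coefficient a for one sample block by replacing the expectations with sums over
the block. It is a minimal sketch, not the program used in the study; the function
names and the fallback to a = 1 for a flat block are assumptions.

```python
def local_coefficient(samples, b):
    """Least-squares a for x_hat[i] = a*(x[i-1] - b) + b over one block."""
    num = 0.0
    den = 0.0
    for prev, cur in zip(samples, samples[1:]):
        num += (cur - b) * (prev - b)
        den += (prev - b) ** 2
    # Assumption: use a = 1 when every previous sample equals b (flat block).
    return num / den if den > 0.0 else 1.0

def prediction_errors(samples, a, b):
    """Predictor errors x[i] - x_hat[i] for i = 1 .. len(samples) - 1."""
    return [cur - (a * (prev - b) + b) for prev, cur in zip(samples, samples[1:])]

if __name__ == "__main__":
    block = [2, 4, 8, 16]
    a = local_coefficient(block, b=0.0)
    print(a, prediction_errors(block, a, b=0.0))
```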
Table IV gives the predictor error and rate change for three
different predictors used on the five test images. Results are given for a
equal to one, and for a equal to the full image correlation of table I. The
largest improvement using the full image correlation, 0.047 bits per sample,
is obtained for the Band image. This indicates that changing the value of a
slowly with respect to the video image rate can give a small average rate
reduction. The predictor error and rate change are also given for local blocks
of four samples. This predictor has smaller error and larger compression than
the other predictors shown, but the overhead rate to describe the value of r
(with accuracy of ±0.02) is about four bits per block, or one bit per sample.
The overhead far exceeds the rate gain. The use of larger and smaller sample
blocks similarly provides no net rate gain. The use of limited systems with
two (r = 1.0, 0.0) or four (r = 1.25, 1.00, 0.75, 0.50) predictors reduces the
subsource indication overhead, but also reduces the subsource identification
gain. In several experiments, no overall rate gain was obtained.
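
The statement that the overhead exceeds the rate gain can be checked directly from
the table IV entries. The sketch below hard-codes the a = 1 error, the local-r
error, and the overhead for each image, and assumes the rate change is 0.5 log2 of
the error, as in tables II and III; the names are illustrative.

```python
import math

# (image, error for a = 1, error for a = local r, overhead in bits per sample),
# copied from table IV.
TABLE_IV = [
    ("Reasoner",    0.0271, 0.0180, 0.882),
    ("Two Girls",   0.0340, 0.0233, 0.984),
    ("Two Men",     0.0466, 0.0329, 1.067),
    ("Writing Pad", 0.0304, 0.0186, 0.898),
    ("Band",        0.2519, 0.1740, 1.212),
]

def rate_change(error):
    """Rate change in bits per sample for a normalized predictor error."""
    return 0.5 * math.log2(error)

def net_change(err_fixed, err_local, overhead):
    """Net rate change of the local predictor relative to a = 1.

    A positive value means the local predictor needs more total rate.
    """
    return rate_change(err_local) + overhead - rate_change(err_fixed)

if __name__ == "__main__":
    for name, err_fixed, err_local, overhead in TABLE_IV:
        print(f"{name:12s} {net_change(err_fixed, err_local, overhead):+.3f}")
```

For every image the net change is positive, so the local predictor costs more rate
than it saves once the overhead is counted.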

The experimental results are in agreement with the theoretical results
obtained with the typical full image statistics. Adaptive predictors are not
useful for image data. This is not unexpected, since none have been reported
in the image coding literature. The effect of nonstationary correlation on the
quantizer and entropy encoder is considered below.
Table IV. Predictor error and rate change for five test images.

                                          Predictor

               a = 1               a = full image r     a = local r
Image          Error    rate ch.   Error    rate ch.    Error    rate ch.   overhead

Reasoner       .0271    -2.603     .0272    -2.600      .0180    -2.898     +0.882
Two Girls      .0340    -2.440     .0336    -2.448      .0233    -2.712     +0.984
Two Men        .0466    -2.211     .0461    -2.220      .0329    -2.462     +1.067
Writing Pad    .0304    -2.520     .0300    -2.529      .0186    -2.875     +0.898
Band           .2519    -0.995     .2359    -1.042      .1740    -1.262     +1.212
NONSTATIONARY PREDICTOR ERROR VARIANCE


In predictive compression, the difference between the predicted next sample
value and the actual value is quantized, entropy coded, and transmitted (figure 2).
The rate distortion function, given in equation 10, is a function of the predictor
error variance, $\sigma^2(1-r^2)$. The predictor error variance depends on both the original
sample variance and on the sample correlation. When the predictor error variance
is nonstationary, the optimum encoder/decoder has several subsource encoder/decoders
with different quantizers and entropy coders. As in the previous section, the range
and effect of the parameters used to define the subsource, $\sigma^2(1-r^2)$, is estimated.
In following sections, adaptive entropy encoders and adaptive quantizers are described.

The range of $\sigma^2(1-r^2)$ depends on r and $\sigma^2$. The values of r measured
using equation 11 on a large number of samples can range from 0 to 1, but are
typically 0.9 to 1.0 for image data. The experimental image data is quantized to six
bits, two's complement, and sample values range from -32 to +31. The sample values
often have a uniform or peaked distribution, so that $\sigma^2$ can be estimated from the
data range. For a uniform distribution with range ±A, $\sigma^2 = A^2/3$. For A = 31,
$\sigma^2 \approx 320$. For A = 16, $\sigma^2 = 85$. For a triangular distribution with range ±A,
$\sigma^2 = A^2/6$, one-half the uniform distribution value. For A = 31, $\sigma^2 = 160$. For
A = 16, $\sigma^2 = 42$.
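
The variance estimates quoted above follow from the two closed-form expressions.
A small sketch, with illustrative function names, reproduces them:

```python
def uniform_variance(A):
    """Variance of a uniform distribution on the range -A .. +A."""
    return A * A / 3.0

def triangular_variance(A):
    """Variance of a symmetric triangular distribution on the range -A .. +A."""
    return A * A / 6.0

if __name__ == "__main__":
    for A in (31, 16):
        print(A, uniform_variance(A), triangular_variance(A))
```

The value quoted for the triangular case with A = 16 is the computed 42.67 with
the fraction dropped.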

The values of r and $\sigma^2$ for the experimental test images are given in table I.
r ranges from 0.87 to 0.99, with an average of 0.96. Three images have $\sigma^2$ of 93 ± 2,
and the other two have $\sigma^2$ = 44 and 173, approximately half and double the value
for the other three images. The Writing Pad ($\sigma^2$ = 44) has large regions of white,
and Two Men ($\sigma^2$ = 173) has considerable sharp background detail.

Equation 10 can be rewritten

$R(D) = \frac{1}{2}\log_2 \sigma^2 + \frac{1}{2}\log_2 (1-r^2) - \frac{1}{2}\log_2 D$          (14)

The first two terms define the change in the rate distortion function due to changes
in $\sigma^2$ and in r. The effect of r on R(D) is given in table II, and the effect of $\sigma^2$
on R(D) is easily computed.
We first consider a source model with subsources having the statistical
variation typical of full images, as given in table I. Suppose the average of the
subsources has sample variance $\sigma^2$, correlation r, and rdf R(D). The two subsources
are defined as follows:

Subsource 1

$\sigma_1^2 = 0.5\,\sigma^2$, $r_1 = r$, $R_1(D) = R(D) + \frac{1}{2}\log_2 (1/2) = R(D) - 0.5$

Subsource 2

$\sigma_2^2 = 1.5\,\sigma^2$, $r_2 = r$, $R_2(D) = R(D) + \frac{1}{2}\log_2 1.5 = R(D) + 0.292$

The two subsources have equal probability, so that the average of the subsources has
variance $\sigma^2$. When a different encoder/decoder is used for each subsource, the
subsource encoder achieves the average of the subsource rdf's.

$R_{ss}(D) = \frac{1}{2} R_1(D) + \frac{1}{2} R_2(D)$

$\qquad = R(D) - 0.104$

The gain of a system adapting on this model, which corresponds roughly to a
different subsource for each image, is about 0.1 bit per sample. Although small,
this gain is larger than the gain of an adaptive predictor.
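
The 0.104 bit figure can be reproduced from equation 14, since a subsource whose
variance is a fixed multiple of the average variance has its rdf shifted by 0.5 log2
of that multiple. The sketch below assumes equal correlation in all subsources, as
in this model; the function names are illustrative.

```python
import math

def rdf_offset(variance_ratio):
    """Shift in bits of a subsource rdf relative to R(D), for equal correlation."""
    return 0.5 * math.log2(variance_ratio)

def average_offset(subsources):
    """subsources: list of (probability, variance ratio) pairs."""
    return sum(prob * rdf_offset(ratio) for prob, ratio in subsources)

if __name__ == "__main__":
    # Two equally likely subsources with one-half and one and one-half times
    # the average variance.
    print(average_offset([(0.5, 0.5), (0.5, 1.5)]))
```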

We next consider the more widely differing subsources that might occur
within an image. Suppose that subsource 1 consists of highly correlated, low detail
or flat image regions comprising 75 percent of the image, and that subsource 2
consists of the remaining low correlation, high detail regions, such as edges.
The average variance is $\sigma^2$, the average correlation is r = 0.95, and the rdf is
R(D). The two subsources are defined as follows:

Subsource 1. (flat)

$\sigma_1^2 = 0$, $r_1 = 1.0$, $R_1(D) = 0$

Subsource 2. (edge)

$\sigma_2^2 = 4\sigma^2$, $r_2 = 0.8$, $R_2(D) = R(D) + \frac{1}{2}\log_2 4 + \frac{1}{2}\log_2 [(1-r_2^2)/(1-r^2)]$,

$R_2(D) = R(D) + 1.0 + 0.942 = R(D) + 1.942$

Using the appropriate encoder/decoder for each subsource, the subsource rate is

$R_{ss}(D) = \frac{3}{4} R_1(D) + \frac{1}{4} R_2(D)$

$\qquad = \frac{1}{4} R(D) + 0.486$

The relative gain depends on the stationary mixture source rate, R(D), and is
given in table V. The table shows that substantial rdf improvements are possible
for the flat/edge model, but the assumption that three-quarters of the image can
be transmitted at zero rate is extreme.
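
The flat/edge subsource rate and the resulting relative gain can be recomputed
from the model parameters (75 percent flat at zero rate, 25 percent edge with four
times the average variance and correlation 0.8, average correlation 0.95). This is
a sketch under those stated parameters; the function names are illustrative.

```python
import math

def subsource_rate(R, r_avg=0.95, r_edge=0.8, var_ratio=4.0, p_edge=0.25):
    """Average rate of the flat/edge model, given the mixture rate R = R(D)."""
    edge_offset = 0.5 * math.log2(var_ratio) \
        + 0.5 * math.log2((1 - r_edge ** 2) / (1 - r_avg ** 2))
    flat_rate = 0.0                      # flat regions are sent at zero rate
    return (1 - p_edge) * flat_rate + p_edge * (R + edge_offset)

def percent_gain(R):
    """Rate reduction relative to the stationary mixture rate, in percent."""
    return 100.0 * (R - subsource_rate(R)) / R

if __name__ == "__main__":
    for R in (1.0, 2.0, 3.0, 4.0):
        print(R, round(subsource_rate(R), 3), round(percent_gain(R), 1))
```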

The next section describes adaptive entropy encoders, which change according
to the predictor error distribution.
Table V. Rate gain for subsource coding of the flat/edge model.

$R_{ss}(D) = \frac{1}{4} R(D) + 0.486$

R(D)      $R_{ss}(D)$     change     percent

1.0       0.736           0.264      26.4%
2.0       0.986           1.014      50.7%
3.0       1.236           1.764      58.8%
4.0       1.486           2.514
ADAPTIVE ENTROPY CODING


In this section, we consider noiseless coding. No quantizer is used, and
the exact predictor difference value is transmitted without distortion. For
noiseless coding, the rdf results of the previous sections can be simplified. The
rdf is the minimum mutual information,

I(x;y) = H(x) - H(x/y)          (15)

where H(x) is the entropy of x, and H(x/y) is the conditional entropy of x given
y. For distortionless transmission, H(x/y) = 0, and R(0) = H(x). The required
rate is simply the entropy of x. If the source producing the output symbols is
nonstationary, the conditional entropy is reduced by considering the memory or
an equivalent subsource identification. The entropy can be measured directly,
and Wyner and Ziv (ref. 12) have shown that the entropy reduction for a source with
memory is a bound on the rdf reduction. Specifically,

$R_{mix}(D) - R_{ss}(D) \le H_{mix}(x) - H_{ss}(x)$

The reduction in the rdf of a source due to memory is less than or equal to the 
reduction in source symbol entropy. 

Table VI shows the entropy of the differences and of the difference magnitudes
for the five test images. Comparing the difference entropy with the values of r
and $\sigma^2$ given in table I shows that the general effect of these parameters on
entropy and rdf agrees with the theory examined in the previous sections. For
example, Writing Pad has correlation nearly identical to that of Two Girls, but
has about one-half the sample variance, and the difference entropy is 0.89 bits
less. Two Men also has similar correlation, but has nearly twice the sample
variance of Two Girls, and has a difference entropy 0.59 bits greater. Reasoner
and Band have sample variance very similar to Two Girls, but have higher and
lower correlation and the difference entropy is correspondingly smaller and greater.
Table VI. Entropy of the differences and difference magnitudes
for the five test images, in bits.

Image           Difference    Magnitude    Sign
                Entropy       Entropy      Entropy

Reasoner        2.05          1.62         0.43
Two Girls       2.72          2.06         0.66
Two Men         3.31          2.61         0.70
Writing Pad     1.83          1.43         0.40
Band            3.78          3.10         0.68

Average         2.74          2.16         0.58
However, if the average entropy is used as a reference point, computation of
the change in entropy using equation 14 and the values of $\sigma^2$ and r from table I
leads to some discrepancies, as shown in table VII. This is apparently due to
the effect of varying experimental difference distributions. The rdf of equations
10 and 14 applies to the Gaussian difference distribution, with variance $\sigma^2(1-r^2)$.
It has been shown (ref. 13) that the entropy and rdf for any symmetrical distribution are
equal to the Gaussian entropy and rdf, plus a constant depending on the distribution.
Therefore, if the test images had the same difference distribution except for the
variance $\sigma^2(1-r^2)$, equations 10 and 14 would yield the correct differences in rdf
and entropy. The exact differences are not obtained because the test image difference
distributions have different shape as well as scale. This problem also affects
quantizer design, considered in the next section.

The entropy of the difference magnitudes is also given in table VI, and is
used below rather than the difference entropy. In the first order Markov data model,
the sample values depend only on the previous sample, and the sample differences
are uncorrelated. This implies that the signs of the predictor differences are
independent. The first order model may not be strictly true for the experimental
data, but using the difference magnitude entropy eliminates the effect of correlated
signs. The magnitude entropy is the information contained in all bits except
the sign bit, and table VI shows that the sign bit actually contains only about
one-half bit of information. The measured average probability of the difference
being equal to zero is 0.42, for the five test images. The sign bit has one bit
of information, but is used only 58 percent of the time, so its average information
content is 0.58 bits, in agreement with the average sign entropy of table VI.

In a first experiment in adaptive entropy coding, we consider the performance
of a single fixed entropy coder used on all five test images. The combined difference
entropy of the five images, based on the combined difference probabilities,
is 2.966 bits. A simple approximate Huffman code is shown in figure 3, and its
rate performance is given in table VIII. The average rate achieved for the five
test images is equal to their combined entropy, within the computational accuracy.
Table VIII also shows that using the optimum entropy code for each image provides
an average rate reduction of 0.23 bits. This is larger than the estimate of 0.1 bits
for full image nonstationarity, made in the previous section.
Table VII. Computed differences in entropy for the five test images.

Image           Change in difference        Change predicted
                entropy from the average    using the $\sigma^2$, r
                of table VI.                of table I.

Reasoner        -0.69                       -0.87
Two Girls       -0.02                       -0.58
Two Men         +0.57                       +0.09
Writing Pad     -0.91                       -1.08
Band            +1.04                       +0.73
Figure 3. Approximate Huffman code.

Difference          Code Word      Length
Magnitude

0                   0              1
1                   10S            3
2,3                 110SR1         5
4,5,6,7                            7
8,9,...,15                         9
16,17,...,31                       11
32,33,...,63                       12

S indicates the sign bit, and Ri indicates the ith magnitude bit
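
The code word lengths of figure 3 are enough to compute the rate of the code for
any difference magnitude distribution. The sketch below encodes only the lengths
listed in the figure (the sign bit is included in each nonzero code word); the
function names and the sample distribution are illustrative assumptions, not data
from the report.

```python
def code_length(magnitude):
    """Code word length in bits for a difference magnitude 0..63, per figure 3."""
    if not 0 <= magnitude <= 63:
        raise ValueError("magnitude must be in 0..63")
    if magnitude == 0:
        return 1
    n = magnitude.bit_length()          # 1 for 1, 2 for 2-3, ..., 6 for 32-63
    return 12 if n == 6 else 2 * n + 1  # lengths 3, 5, 7, 9, 11, then 12

def average_rate(magnitude_probs):
    """Average bits per sample; magnitude_probs maps magnitude -> probability."""
    return sum(p * code_length(m) for m, p in magnitude_probs.items())

if __name__ == "__main__":
    # Hypothetical distribution, for illustration only.
    print(average_rate({0: 0.5, 1: 0.3, 2: 0.1, 4: 0.1}))
```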
Table VIII. Performance of the approximate Huffman code of figure 3.

Image           Difference    Rate for        Rate
                Entropy       Approximate     Increase
                              Huffman code

Reasoner        2.05          2.19            0.14
Two Girls       2.72          2.96            0.24
Two Men         3.31          3.64            0.33
Writing Pad     1.83          2.01            0.18
Band            3.78          4.04            0.26

Average         2.74          2.97            0.23
The observed nonstationarity within the test images, and the computation of
potential rate gain in the last section, indicate that further rate gain can be
obtained by adapting the entropy coding within the image. Local nonstationarity
appears as nonstationary predictor error variance, which causes the sample difference
magnitudes to be dependent. The dependence may be significant over a span of
many samples, although the closest are usually more similar.

The potential rate reduction due to local nonstationarity can be bounded by
the entropy reduction for the difference magnitude, when the difference magnitude
is conditioned on the previous difference magnitude. The conditional magnitude
entropy can be found by computing the entropy of pairs of differences.


$H(|x_i - x_{i-1}| \,/\, |x_{i-1} - x_{i-2}|) = H(|x_i - x_{i-1}|,\ |x_{i-1} - x_{i-2}|) - H(|x_{i-1} - x_{i-2}|)$


The conditional magnitude entropy is shown in table IX for the five test images,
and is equal to the rate required to transmit noiselessly the current difference
magnitude, given the previous difference magnitude. The average rate gain is
only 0.17 bits per sample, or 7.8 percent of the magnitude entropy. The gain
is high for Reasoner and Band, which have a few large areas of similar samples,
and is low for Two Girls, which has small detail throughout the image.
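
The conditional magnitude entropy can be estimated from a line of samples by
counting single magnitudes and pairs of successive magnitudes. The following is a
minimal sketch of that computation using empirical frequencies; it is not the
program used in the study, and the function names are illustrative.

```python
import math
from collections import Counter

def entropy(symbols):
    """Empirical entropy in bits of a sequence of hashable symbols."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def conditional_magnitude_entropy(samples):
    """Entropy of the current difference magnitude given the previous one.

    Computed as the entropy of magnitude pairs minus the entropy of the
    previous magnitude, both estimated over the same set of pairs.
    """
    mags = [abs(b - a) for a, b in zip(samples, samples[1:])]
    pairs = list(zip(mags, mags[1:]))
    previous = [prev for prev, _ in pairs]
    return entropy(pairs) - entropy(previous)

if __name__ == "__main__":
    line = [0, 0, 1, 1, 2, 2, 3, 3, 4]   # magnitudes alternate 0, 1, 0, 1, ...
    mags = [abs(b - a) for a, b in zip(line, line[1:])]
    print(entropy(mags), conditional_magnitude_entropy(line))
```

In this artificial line each magnitude is fully determined by the previous one, so
the conditional entropy is zero although the magnitude entropy is one bit.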

Although the subsource definition and the current difference value are
mutually dependent, some functional distinction between subsource and difference
information must be made. Here, the subsource definition will include information
having significant dependencies between successive sample differences. The subsource
identification is largely determined by the most significant magnitude bits of the
closer differences, and has little relation to the sign bits or less significant
bits.

We first consider the case where the subsource identification is based only
on the magnitude of the current difference, but uses the difference correlation.
That is, the subsource identification is the non-independent part of the difference
Table IX. Reduction in entropy for conditional difference magnitudes.

Image           Entropy of    Magnitude    Conditional    Gain
                magnitude     entropy      magnitude
                pairs                      entropy

Reasoner        3.03          1.62         1.41           0.21
Two Girls       4.04          2.06         1.98           0.08
Two Men         5.08          2.61         2.47           0.14
Writing Pad     2.71          1.43         1.28           0.15
Band            5.92          3.10         2.82           0.28

Average                       2.16         1.99           0.17
information. The subsource identification contains some of the difference data,
and must be transmitted for each difference. Since the same information is
transmitted for independent differences, there is no overhead penalty when there is no
difference correlation.

This system was implemented experimentally by dividing the difference data
by powers of two, that is, 1, 2, 4, etc., and treating the quotient (the most
significant bits) as the subsource identification, and the remainder as data.
The sum of the subsource and data entropy is equal to the original difference
magnitude entropy, but the subsource identification entropy is reduced when the
subsource identification is conditioned on the previous subsource identification.
This data is shown in the second numerical column of table X, for division by 2.
There is one data bit (with entropy very nearly 1.0) and the remaining difference
magnitude information defines the subsource. As expected, this method has performance
similar to transmission of the difference magnitude conditioned on the previous
difference magnitude. For division by 4 or 8 there is little information in the
subsource identification, since most differences are small.

In another experiment, the subsource identification is set equal to the
location of the most significant bit in the difference magnitude. There are
seven subsources, numbered 0 through 6 as shown.

7 difference bits     7     6     5     4     3     2     1
value                 ±     32    16    8     4     2     1
7 subsources                6     5     4     3     2     1     0
magnitude range             31    15    7     3     1     0     0

Given the subsource identification (the location of the largest non-zero
magnitude bit), the range of the remaining magnitude data is limited. The subsource
identification is conditioned on the previous subsource, and the remaining data is
entropy coded, conditioned on the subsource. The total entropy results of this
subsource identification are shown in the third numerical column of table X. This
and the previous method are similar to conditional encoding of the difference
Table X. Total magnitude entropy for several methods of subsource identification.

                                      Subsource Identification

Image           Magnitude      Difference    Location       Largest      Standard
                entropy        magnitude     of most        magnitude    deviation
                conditioned    divided       significant    in block     of block
                on previous    by 2          magnitude      of 16        of 16
                magnitude                    bit

Reasoner        1.41           1.44          1.42           1.42         1.44
Two Girls       1.98           1.99          2.00           2.03         2.02
Two Men         2.47           2.55          2.48           2.44         2.44
Writing Pad     1.28           1.33          1.31           1.35         1.38
Band            2.82           2.88          2.84           2.90         2.92

Average         1.99           2.04          2.01           2.03         2.06
magnitudes, but are slightly worse in performance and are simpler in implementation.
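
The two subsource identifications described above amount to splitting each
difference magnitude into an identification part and a data part. A minimal
sketch, assuming magnitudes in the range 0 to 63 and illustrative function names:

```python
def split_by_division(magnitude, divisor=2):
    """Return (subsource id, data) as (quotient, remainder) for a power-of-two divisor."""
    return divmod(magnitude, divisor)

def split_by_msb(magnitude):
    """Return (subsource id, remaining data) using the most significant bit location.

    Subsource 0 is magnitude 0; subsource k (1..6) means the leading one bit
    has value 2**(k-1), and the remaining data lies in 0 .. 2**(k-1) - 1.
    """
    k = magnitude.bit_length()
    remainder = magnitude - (1 << (k - 1)) if k > 0 else 0
    return k, remainder

if __name__ == "__main__":
    for m in (0, 1, 5, 16, 63):
        print(m, split_by_division(m), split_by_msb(m))
```

For subsource 6 the remaining data ranges up to 31, and for subsources 1 and 0 no
data remains, matching the magnitude ranges listed for the seven subsources.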

It is more usual, especially in adaptive quantization, to define the subsource
using some measured statistic of a block of difference magnitudes. Experiments
were also performed using the subsource defined as the largest difference magnitude
in a block, and as the standard deviation of the differences in a block. For these
two methods, and the five test images, the minimum entropy usually occurred for
sample difference blocks of length 8 or 16, with one minimum at 32. The data is
given for both methods, for blocks of 16, in table X. Performance is similar to
that of the other two methods described above. A further rate reduction of about
0.02 bits can be made if the subsource identification is conditioned on the previous
subsource identification, as for the other two methods.

A comparison of the methods of adaptive entropy coding considered in this
section is given in table XI. The results are all given in terms of difference
entropy, by adding the sign entropy of each image (table VI) to the magnitude
entropies. The result is correct if the difference signs are mutually independent,
and independent of the difference magnitude, as in the first order Markov data model.
The simplest method, with the highest rate, is to use the simple fixed approximate
Huffman coder of figure 3 for all images. Using the optimum coder for each image
gives a 7 percent rate reduction. Using the local nonstationarity or subsource
identification information provided by the previous difference or neighboring
differences gives an additional 6 percent rate reduction.

The small gains of adaptive entropy coding are not untypical, considering
results reported in the literature. Rice and Plaunt (ref. 14) used an adaptive variable
length noiseless coding system for image data. The most appropriate code for
the sample differences was selected for each block of 21 samples. The system
produced rates within 0.25 bits of the entropy for test image areas with a wide
range of entropy. Rates below the average entropy were not sought or observed.
Spencer and May (ref. 15) compared Rice and Plaunt's technique to the method of using
the previous line statistics to generate a code for the current line. Both methods
gave a 10 percent gain in rate over the optimum full image Huffman code. Davisson
and Grey combined a run length code, three variable length codes, and direct
difference transmission, and selected the best method for blocks of 64 samples.
Table XI. Comparison of adaptive entropy coding methods.

Image             Fixed          Sample        Previous       Subsource
                  approximate    difference    difference     identification
                  Huffman        entropy       conditional    (best method)
                  code                         entropy

Reasoner          2.19           2.05          1.84           1.85
Two Girls         2.96           2.72          2.64           2.66
Two Men           3.64           3.31          3.17           3.18
Writing Pad       2.01           1.83          1.68           1.71
Band              4.04           3.78          3.50           3.52

Average           2.97           2.74          2.57           2.59

Percent change    0              -7.3%         -13.5%         -12.8%
They noiselessly compressed an image with 3.3 bits per sample average entropy
to 3.0 bits per sample.

Although the gain of adaptive entropy coding for the nonstationary image
model is only 0.4 bits, this gain is an order of magnitude greater than the
gain of adaptive prediction. The entropy gain is an upper bound on the rdf
improvement due to nonstationarity. For the five test images, the rate improvement
due to nonstationarity is limited to 0.4 bits, when we go from a single
fixed method for all five images to a locally adaptive method. The rate improvement
is limited to 0.17 bits, on the average, when we go from the optimum fixed
method for a single image to a locally adaptive method. In the next section, we
consider adaptive quantizers, which achieve higher rate gains, for reasons not
related to nonstationarity.
The previous section considered entropy or noiseless coders used without
quantizers. This section considers quantizers, both with and without entropy
coders. Quantizers reduce the transmission rate, at the cost of reduced fidelity.
Memoryless quantizers operating independently on the predictor errors can perform
close to the rdf for stationary data (ref. 16). The optimum quantizers have been found
for several predictor error distributions, including the Gaussian, exponential,
and gamma distributions (refs. 17, 18). The theoretically optimum quantizers are scaled
according to the standard deviation of the quantized variable. The Gaussian
quantizers typically have one-half the range of the exponential or gamma quantizers,
and the latter are more suited to image predictor difference data.

The most efficient quantization method is uniform quantization followed by
entropy coding (ref. 19). Performance is within 1/4 bit per sample of the Gaussian rdf
(ref. 16). The optimum non-uniform quantizer, without an entropy coder, requires a
20 percent plus 1/8 bit rate increase over the Gaussian rdf. Addition of entropy coding
to the optimum non-uniform quantizers brings performance close to that of the
optimum uniform quantizers with entropy coding. The performance penalty for the
optimum non-uniform quantizers without entropy coding is larger for the exponential
and gamma distributions than for the Gaussian distribution (ref. 20).

Quantizers introduce both in-range and out-of-range distortion. If the quant- 
izer range is less than the full possible difference range, large differences are 
represented by smaller difference values. This causes some large errors in the 
reconstructed samples, but the probability of large differences is small. Such 
errors are visible in images as edge blurring or slope overload. If the smallest 
quantizer interval size is larger than the least significant bit of the original 
data, the reconstructed samples will have small random errors, similar to the errors 
produced by original quantization using too few bits. These errors are visible as 
contouring in the flat, low contrast areas of the image. Quantizer designs are 
optimized by considering these two kinds of errors. If a uniform quantizer is 
gradually widened from the minimum range, the total mean-square error is first 
decreased as slope overload is reduced, and then increased as contouring becomes 
pronounced. Optimum non-uniform quantizers obtain the best possible performance
by using larger quantization intervals at the larger, less probable difference 
values. 

The first step in investigating the performance of adaptive quantizers is to
test fixed quantizers for the five test images. The experimental data are six bits
and sample values range from -32 to +31. The difference data are integers, with
a possible range of -63 to +63. The difference data for these images, and for
images in general, usually have an exponential distribution, and a scaled discrete
version of some theoretical quantizer could be used. Instead, the experimental
data were used to design the quantizers. A computer program was written to find
the minimum mean-square error discrete quantizer, by exhaustive search. The
quantizers were designed using the difference magnitude probabilities, and therefore
are symmetrical about the zero difference point. To minimize the mean-square error,
differences were assigned to the closest representative value (ref. 23). For equally
distant representative values, the smaller was used. Quantizers were designed for
M, the number of representative values, equal to 2, 3, 5, and 7.

For symmetrical distributions, quantizers with M odd always have zero as a
representative value, while theoretical quantizers with M even usually have all
the representative values symmetrical in pairs about zero. For image sample
differences, the probability that the difference is zero is large, and quantizers
with M odd usually have better performance than quantizers with a one-larger number
of representative values. This can be shown by a simple example. Suppose that the
probability that the difference is zero is slightly larger than one-half, and the
probability that the difference is ±1 is slightly less than one-half.


p(0) = 0.5 + e 
p(±l) = 0.5 - e 


If M is equal to 2, the optimum symmetrical representative values are ±1, and the 
mean-square error is 
$\mathrm{MSE}(M=2, \pm 1) = (0.5+e)\cdot 1^2 + (0.5-e)\cdot 0$

$\qquad = 0.5 + e$

If all the differences are represented by zero, the error is smaller.

$\mathrm{MSE}(M=1, 0) = (0.5+e)\cdot 0 + (0.5-e)\cdot 1^2$

$\qquad = 0.5 - e$

Because of the superior performance of M odd quantizers, the optimum quantizers
for experimental difference data for M even usually have only M - 1 levels, with
a zero representative value. M even quantizer designs were obtained by not allowing
zero to be a representative value.

If all the sample differences are represented by zero, no difference data is
actually transmitted, and the image or line is represented by the first
sample. Obviously it is much better to transmit a ±1 indication of the sample
change (delta modulation) than to send no information. The above anomalous result
occurred because the effect of the quantizer is analyzed as if the quantizer followed
the differencing loop of figure 2, instead of being within the loop. In such an
analysis, the quantizer is designed for the original sample differences, not for
the differences between the current sample and the previous transmitted sample,
which has been reconstructed from a sequence of quantized differences. Treating
the quantizer as outside the loop gives acceptable results only when the quantizer
error is small. Table XII shows, for the Reasoner image, the estimated mean-square
error of the optimum quantizer computed as if the quantizer were outside the loop
and the actual measured mean-square error when the quantizer is used in the loop.
For large M, the estimated error is nearly correct, but for M equal to 2, the
estimated error is an order of magnitude smaller than the measured error.
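
The difference between the estimated and the measured error can be illustrated
with a small sketch that applies the same quantizer outside and inside the
prediction loop. It assumes the previous-sample predictor (a = 1), exact
transmission of the first sample, and a quantizer given by its non-negative
representative magnitudes; the function names and the test ramp are illustrative.

```python
def quantize(diff, levels):
    """Nearest signed representative value; ties go to the smaller magnitude."""
    mag = abs(diff)
    best = min(levels, key=lambda v: (abs(mag - v), v))
    return best if diff >= 0 else -best

def mse_outside_loop(samples, levels):
    """Estimated error: quantize the original sample differences directly."""
    diffs = [b - a for a, b in zip(samples, samples[1:])]
    return sum((d - quantize(d, levels)) ** 2 for d in diffs) / len(diffs)

def mse_in_loop(samples, levels):
    """Measured error: each difference is taken from the reconstructed sample."""
    recon = samples[0]                  # assumption: first sample sent exactly
    total = 0.0
    for x in samples[1:]:
        recon += quantize(x - recon, levels)
        total += (x - recon) ** 2
    return total / (len(samples) - 1)

if __name__ == "__main__":
    ramp = [0, 3, 6, 9, 12]             # steady slope of 3 per sample
    print(mse_outside_loop(ramp, [1]), mse_in_loop(ramp, [1]))
```

With the M = 2 quantizer (representative values ±1) the reconstruction falls
further behind the ramp at every sample, so the in-loop error grows well beyond
the estimate made from the original differences.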

The above result indicates that it is unlikely that low rate, high distortion
quantizers designed for the original sample differences will give the optimum
Table XII. The optimum quantizers computed for the Reasoner image
sample differences, and the estimated and measured mean-square error.

Number        Quantizer              Estimated    Measured
of levels     representative         error        error
M             levels

15            0,1,2,3,4,7,10,13      0.0229       0.(^5
7                                    0.1549       0.2771
5             0,1,6                  0.397        1.266
4             1,6                    0.963        1.714
3             0,5                    0.946        3.807
2             1                      2.023        23.979
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perfornu«rce in actual use. This observation led’ us@ of an iterative, 

iiitci active program to obtain the best fixed quantisers for the five test images. 

This computer program iteratively designs the optimum quantizers for the actual 
dilTerence distributions produced by the previously tested quantizers. Operator 
interaction i 'ovided to ^rminate repetitive searches and to input additional 
test quantizers. ‘Che best quantizer results produced by this method are given in 
table XIII, for the five teat images. Except for M equal to 3^ performance for 
the Heasoner image is improved. 

The effect of full image nonstationarity can be estimated from the results
shown in table XIV. The same quantizations (the optimum quantizations for the
Reasoner image) were used for all five test images.
Although these are not the optimum quantizations for the group of five test images,
the results provide an upper bound on the error of the optimum quantizers. The
increase in mean-square error over that for the best fixed quantizers for each
image ranges from 12 to 70 percent. This corresponds to an average transmission
rate increase of about 0.3 bits per sample. Using the optimum quantizers for
Band for all five test images gave higher average mean-square error.

As we did for adaptive entropy coding, we next consider the effect of local
nonstationarity within the image. Four different methods of subsource identification
were used for adaptive entropy coding, and all four were found nearly equal in
performance. For adaptive quantization, we test only one approach, described by
Ready and Spencer (ref. 21). A block of sample differences is quantized using several
different quantizers, and the resulting mean-square error is computed. The
transmission consists of the identification of the quantizer with the smallest error,
and the corresponding quantized data. The best mean-square error results were
obtained using four quantizers, and blocks of four sample differences. These
results are shown in table XV.
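
The block-adaptive selection described above can be sketched as follows. This is an illustrative reading of the method, not the DCOMK program itself. The function names, the nearest-level rule, and the choice to run each candidate quantizer inside the differencing loop are assumptions. The rate helper assumes fixed-length codes for both the quantizer output levels and the quantizer identification.

```python
import math


def quantize(d, levels):
    """Nearest signed representative level; `levels` are magnitudes."""
    mag = min(levels, key=lambda r: abs(abs(d) - r))
    return mag if d >= 0 else -mag


def encode_block(prev_recon, block, levels):
    """Code one block in the loop; return (codes, last reconstruction, squared error)."""
    recon, codes, err = prev_recon, [], 0.0
    for s in block:
        q = quantize(s - recon, levels)
        recon += q
        codes.append(q)
        err += (s - recon) ** 2
    return codes, recon, err


def adaptive_encode(samples, quantizers, block_size=4):
    """For each block, try every quantizer and keep the one with least error."""
    recon = samples[0]
    choices, total = [], 0.0
    for start in range(1, len(samples), block_size):
        block = samples[start:start + block_size]
        results = [encode_block(recon, block, lv) for lv in quantizers]
        best = min(range(len(quantizers)), key=lambda i: results[i][2])
        _, recon, err = results[best]
        choices.append(best)
        total += err
    return choices, total / (len(samples) - 1)


def rate_bits_per_sample(output_levels, num_quantizers, block_size):
    """Fixed-length level code plus the per-block quantizer identification."""
    return math.log2(output_levels) + math.log2(num_quantizers) / block_size
```

With four quantizers and blocks of four differences, the identification costs 0.5 bits per sample, so two, three, and four output levels give about 1.5, 2.1, and 2.5 bits per sample, matching the transmission rates listed in table XV.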

The experimental rate and mean-square error distortion are plotted for
Reasoner in figure 4. Several compression methods are included, and similar
results are given for the Band image and the average of the five images in
figures 5 and 6.

Table XIII. The best fixed quantizers, and resulting mean-square error,
found by iterative search.

                          Number of representative levels
Image                 M = 2     M = 3         M = 4     M = 5     M = 7

Reasoner     levels   4         [illegible]   1,8       0,1,8     0,1,4,11
             error    8.83      3.81          1.19      0.76      0.214
Two Girls    levels   3         0,3           1,4       0,1,6     0,1,4,9
             error    4.61      2.62          1.11      0.79      0.321
Two Men      levels   4         0,7           2,9       0,3,10    0,1,4,11
             error    12.05     4.68          2.58      1.45      0.851
Writing Pad  levels   2         0,3           1,4       0,1,4     0,1,4,9
             error    2.52      1.05          0.69      0.24      0.105
Band         levels   5         0,9           2,11      0,5,14    0,3,8,17
             error    19.69     9.36          5.15      3.48      1.17
Average      error    9.56      3.40          2.14      1.34      0.64

Table XIV. Mean-square error using the same quantizations for all
five test images. (The Reasoner quantizations of table
XIII are used.)

                      Number of representative levels
Image            M = 2     M = 3     M = 4     M = 5     M = 7

Reasoner         8.83      3.81      1.19      0.76      0.214
Two Girls        6.14      2.17      1.58      1.23      0.370
Two Men          12.05     5.55      2.85      2.58      0.851
Writing Pad      5.61      2.14      0.89      0.43      0.113
Band             20.95     13.04     6.60      6.37      2.87
Average          10.72     5.34      2.62      2.27      0.88

Table XV. Mean-square error for adaptive quantization.

Transmission rate     1.5            2.1              2.5

Quantizers            1    3         0,1   0,3        0,1    [illegible]
                      6    9         0,7   0,11       2,10   [illegible]

Image
Reasoner              1.58           0.39             0.30
Two Girls             1.72           0.56             0.56
Two Men               3.70           1.37             1.05
Writing Pad           0.97           0.21             0.19
Band                  9.48           4.16             2.25
Average               3.49           1.34             0.87

Adaptive quantization gains about 0.5 bits per sample for the Reasoner image,
compared to the best fixed quantizers, but adaptive quantization gains only about
0.1 bits per sample for Band, and 0.1 to 0.2 bits per sample for the five image
average. The high entropy, high distortion Band image tends to dominate the five
image average measurements. Using one method of adaptive quantization for all the
test images gives better performance than using the optimum fixed quantizer for
each image, and is more easily implemented. The gains due to between-image and
local nonstationarity are very similar to those of adaptive entropy coding, where
coders designed for each image gained 0.25 bits per sample over a fixed code for
each image, and a locally adaptive entropy coder gained a further 0.17 bits per
sample.

The adaptive quantization rate gains for the four test images not including 
Band are about twice as large as the rate reduction due to adaptive entropy coding. 
Since the reduction in entropy is an upper bound on the reduction in the rdf, some 
of the gain of adaptive quantization is due to reducing the inefficiency of fixed 
quantizers . 

Figures 4, 5, and 6 also show the rate-distortion results of two dimensional
Hadamard transform compression, performed on 8 by 8 blocks of samples. The
compression method is fixed for each rate, not adaptive, and has been described
previously (refs. 22, 23). The results shown are for independent field compression,
rather than interlaced frame compression. Frame compression requires more memory, but
produces higher compression gain. The transform coefficient quantizers were designed
to reflect the observed flat-edge nonstationarity of the test images. The comparison
of the results in figures 4, 5, and 6 illustrates the well-known fact that predictive
coding is much better than transform coding at high rates (3-4 bits per sample),
and much worse than transform coding at low rates (1 bit per sample). In the
Reasoner data shown in figure 4, a mean-square error less than about 0.01
corresponds to an acceptable compressed image. Transform coding has relatively high,
although subjectively unobjectionable, mean-square error at high rates because of
the large number of computations and coefficients affecting each reconstructed
sample. Predictive coding with a = 1 is mathematically simple. Transform coding
is more successful than predictive coding at low rates for several reasons. The
8 by 8 transform block is two dimensional, and includes a larger number of
horizontally and vertically correlated samples. At the lowest rates, the best
transform compression designs effectively perform horizontal and vertical
subsampling.

For predictive coding, the effect of 2 to 1 horizontal subsampling was tested
by combining it with adaptive quantization. The results are shown in table XVI,
and are also plotted in figures 4, 5, and 6. As expected, subsampling provides
improved performance at lower rates, while introducing some fixed error at the
higher rates. Those samples omitted by subsampling are reconstructed by averaging
the adjacent reconstructed samples. As a minor benefit, subsampling allows the
overall rate to be determined more flexibly, by allowing quantizers with larger
numbers of levels to be used at lower total rates.
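
The subsampling and reconstruction step can be sketched as below. This is a minimal illustration and not the DCOML program. The function names are hypothetical, and the handling of a final omitted sample with no right-hand neighbor (it repeats the left neighbor) is an assumption, since the text does not say how the line edge is treated.

```python
def subsample_2to1(line):
    """Keep every second sample of a line, starting with the first."""
    return line[::2]


def reconstruct_2to1(kept, length):
    """Rebuild a line of `length` samples from the kept (reconstructed) samples.

    Each omitted sample is the average of its two reconstructed neighbors.
    A trailing omitted sample with no right neighbor repeats the left one
    (an assumption made for this sketch).
    """
    out = [0.0] * length
    out[::2] = kept
    for i in range(1, length, 2):
        left = out[i - 1]
        right = out[i + 1] if i + 1 < length else left
        out[i] = (left + right) / 2
    return out
```

A linear ramp is reconstructed exactly, while detail finer than two samples is lost, which is the fixed error seen at the higher rates.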

Predictive coding, adaptive quantizer predictive coding, or adaptive quantizer
predictive coding with subsampling are able to perform better than transform coding
at high rates down to 2 or 1.3 bits per sample. Although transform coding is much
better than the predictive methods at lower rates, even the best transform images
at such lower rates are subjectively unacceptable.

Table XVII shows the entropy of the best fixed quantizers of table XIII, and
also the noiseless entropy from table VI. These results have also been plotted
in figures 4, 5, and 6. This data shows that the predictive methods using
quantization with variable rate entropy coding are significantly superior in performance
to the fixed quantizer methods. Two dimensional transform is superior at the
lowest rates.
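
The entropy figures of table XVII are of the following kind. The sketch computes the first-order entropy of a sequence of quantizer output levels and the maximum entropy for M equally likely levels. The function names are illustrative, and the sketch says nothing about how the original measurements were actually programmed.

```python
import math
from collections import Counter


def output_entropy(symbols):
    """First-order entropy, in bits per sample, of quantizer output symbols."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def max_entropy(num_levels):
    """Entropy when all M output levels are equally likely."""
    return math.log2(num_levels)
```

The maximum entropy values for M = 3, 5, and 7 are 1.585, 2.322, and 2.807 bits, which agree with the bottom row of table XVII. A variable rate code can approach the measured entropy, while a fixed rate code must spend at least the maximum.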

The rate distortion curves for figures 4, 5, and 6 can be approximated by
a straight line drawn between the zero rate, 100 percent error rate point at the
lower right corners, and the noiseless coding rate points on the left axes. The
plotted results of entropy coding the output levels of the optimum quantizers are
all within 0.3 bits per sample of the rdf, and are much closer at the lower rates.
Entropy coding obtains similar significant rate gains for theoretical distributions, 
and a further slight improvement can be obtained by entropy coding the output 
levels of the optimum uniform quantizers (refs. [illegible], 19, 20).

Table XVI. Mean-square error for 2 to 1 horizontal subsampling
with adaptive quantization.

                          Transmission rate
Image            1.0       1.25      1.5       2.0       2.5

Reasoner         4.056     2.713     0.972     0.545     0.546
Two Girls        2.926     2.202     1.369     0.951     0.835
Two Men          8.322     6.276     3.315     2.416     2.387
Writing Pad      1.188     0.953     0.495     0.372     0.314
Band             19.106    16.703    9.220     7.241     7.455
Average          7.119     5.768     3.074     2.306     2.307

Table XVII. The quantizer output entropy for the best fixed quantizers
of table XIII.

                         Number of representative levels
Image              M = 2    M = 3    M = 4    M = 5    M = 7    noiseless

Reasoner           1.0      0.728    1.235    1.638    1.751    2.05
Two Girls          1.0      1.323    1.627    1.949    2.134    2.72
Two Men            1.0      1.069    1.389    1.691    2.404    2.31
Writing Pad        1.0      0.874    1.317    1.604    1.610    1.83
Band               1.0      1.136    1.630    1.653    2.098    3.78
Average            1.0      1.026    1.440    1.707    1.999    2.74
Maximum entropy    1.0      1.585    2.000    2.322    2.807
(equally likely)
CONCLUSION


The improved performance of adaptive compression techniques is well known.
The major assumption of this investigation was that the gains of adaptive
compression are due to the nonstationarity of the image source. This is commonly
accepted. The objective was to use the nonstationary source model to obtain
the maximum additional data compression due to nonstationarity. Contrary to the
above assumption, it was found that the rate distortion bound improvement due to
image nonstationarity is small, and that significant performance gains in
compression systems cannot be made by considering the source nonstationarity. The gains
of adaptive quantizer coding are due more to the inefficiency of non-adaptive fixed
rate quantizers in approaching the rdf, than to the source nonstationarity.

In the experiments leading to this conclusion, the mean-square error results
of several different compression methods based on one dimensional predictive
coding, and one method of two dimensional transform compression were compared.
The relative performance is in agreement with results in the literature, in that
predictive coding is superior to transform at higher rates and inferior at lower
rates, and in that variable rate entropy coding gives performance near the rate
distortion bound.
This report describes the computer programs and compressed video data
developed on the SEL 32 computer under contract NAS2-9703. Previous work was
reported in the interim final report, Section B, Appendix D. The interim final
report indicated that nearly all of the original data four frame sets, monochrome
sequences, and color sequences had been transferred to the SEL 32.

All computer programs are in Fortran, but they use assembly language routines
found in the Video Library. The programs described here, and preliminary versions
and minor variations, are all files under USERNAME JONES or USERNAME IMAGE. The
programs have also been saved by username on two tapes. Because of the ready
availability of these files, no listings are attached.

Table I indicates the six major functions of video processing that have
been performed in the NASA-Ames Video Research Lab. The individual programs
are listed under these functions.

Table I. SEL 32 Program Functions

     1. Video Data Record
     2. Test Video Display
     3. Video Display
     4. Video Digital Tape Reformat or Copy
     5. Video Compression
          Transform Programs
          Differencing Programs
     6. Video Data Analysis

1. Video Data Record

     (none) - records monochrome and color video on digital tape
            - currently unavailable due to hardware modifications
            - see old DREC and EREC, format information below

2. Test Video Display

     JRAMP  - tests the display link, using a partial frame ramp on frame 1
            - ACTIVATE JRAMP
     KCOL   - tests the display link, using horizontal color bars and letters
            - ACTIVATE KCOL

3. Video Display

            - all programs display both D and E format, color or monochrome
     DISP 5,6 - displays all frames on a tape up to a double EOF
            - 5 for M710, 6 for M711
            - ACTIVATE DISPLAY5, ACTIVATE DISPLAY6
     JDISP  - displays frames from selected tape onto selected frame
            - requires both tape mounts and TYXX (note 1)
            - ACTIVATE JDISP
     (none) - reformat tape for DICOMED
            - not available

4. Video Digital Tape Reformat or Copy

     DORE   - converts D to E, E to D formats
            - ACTIVATE DTOE, ACTIVATE ETOD
            - to catalog, USE DORE, CH/DORE//DTOE/, RUN
     ECOPY  - copies a D or E tape, with video display
            - catalog into JONESMOD
     ERTOY  - converts YIQ to RGB, RGB to YIQ (note 2)
            - option to select frames
            - requires TY60, discfile (notes 1, 3)
            - catalog into JONESMOD
     DFMT   - converts DINT (frame) to DSTD (field), DSTD to DINT
            - requires TY60, discfile
            - catalog into JONESMOD

5. Video Compression

   Transform Programs

     E8x8   - 8x8 Hadamard transform, monochrome and color
            - input E tape M710, output E tape M711
            - requires TY60, discfile
            - uses CR?8 cards to define bit assignments and quantizations
            - catalog into JONESMOD
     T8x8   - like E8x8, includes cosine and quasi-cosine transforms
            - additional cards to define post Hadamard vector transform matrix
            - catalog into JONESMOD
     E88F   - 8x8 Hadamard conditional replenishment
            - use E88FA for multiple tape input
            - like E8x8
            - additional cards for refresh lists, change threshold, mode threshold
            - catalog into JONESMOD
     FRAVG  - averages E frames in time
            - for output of E88F, SOTOY
            - requires TY60, discfile (notes 1, 3)
            - requires tape input, tape and/or display output

   Differencing Programs

     DCOR8  - adaptive prediction
            - D tape input on M710
            - requires TY60, discfile (notes 1, 3)
            - catalog into JONESMOD
     DCORC  - prediction and adaptive entropy coding
            - like DCOR8
     DCOMP  - prediction and best theoretical quantizers
            - D input on M710, D output on M711
            - requires TY60, discfile (notes 1, 3)
            - catalog into JONESMOD
     DCOMI  - iterative, interactive quantizer selection
            - like DCOMP
     DCOMK  - adaptively selected fixed quantizers
            - like DCOMP
     DCOML  - subsampling, prediction, adaptive quantization
            - like DCOMP

6. Video Data Analysis

     LCOR   - measures in-line and between line correlation
            - like DCOR8
     TQUAN  - design optimum quantizers, as in DCOMP
            - place in workfile, RUN
            - requires [device designation illegible]
     DDIP   - measures MSE between D tapes
            - not debugged

Notes:

     1  TYXX assignments may be changed by recatalogging.
     2  Files above ERTOY are in USERNAME IMAGE; ERTOY and below are in
        USERNAME JONES.
     3  Discfiles must be created before program use. Use the job outline
        in subroutine WRDISC to create the discfiles. WRDISC has many versions,
        so use the one in the program to be run.
     4  The cards to run the transform compression programs are in the card file
        drawers.

We next consider the video tape formats. These have been transferred
basically unchanged from the SEL 840, but the data is now in eight bit format
rather than six bit format. The video data tape conversion programs, discussed
in the interim final report, expanded the data bits but retained the original
formats.

Video Tape Format Definition 

The video data digital tapes consist of 1) title records, 2) data records,
and 3) end of file marks (EOF's).

1. Title Records

Each video tape file has one standard title record of 25 words. A computer
word is 32 bits.

First word - tape identification (A4)
Words 2 to 10 - miscellaneous alphanumeric and integer information, including
date, etc. (15A4)
Words 11 to 25 - descriptive title (15A4)
Word 25 - format identification, DSTD, etc. (A4)

2. Data Records

Each video data digital tape has many data records based on the D format
video line. A D format line has 107 words of 32 bits each.

First word - number of data words (always 104), an integer
Second word - line number (6 to 237, or slightly higher), an integer
Third word - video field or color number (0 to 7), an integer
Words 4 to 107 - two's complement data, packed eight bits per picture element,
four elements per word

A D format tape record consists of one D format line. An E format record
consists of eight D format video lines, joined in sequence.
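
The D format line layout above can be read with a short sketch such as the following. It is not one of the SEL 32 programs. The function name is hypothetical, and the byte order within each 32 bit word (big-endian here) and the order of the four picture elements within a word are assumptions that the text does not state.

```python
import struct

WORDS_PER_LINE = 107      # 3 header words + 104 data words
BYTES_PER_WORD = 4        # a computer word is 32 bits


def parse_d_line(record, byteorder=">"):
    """Split one D format line into (line number, field number, samples).

    `byteorder` is an assumption: ">" reads the header words as big-endian.
    Samples are returned as signed (two's complement) 8 bit integers,
    four per data word, in the order they appear in the record.
    """
    if len(record) != WORDS_PER_LINE * BYTES_PER_WORD:
        raise ValueError("a D format line must be 107 words of 32 bits")
    count, line_no, field_no = struct.unpack(byteorder + "3i", record[:12])
    if not 0 <= count <= WORDS_PER_LINE - 3:
        raise ValueError("data word count out of range")
    n_samples = count * 4
    samples = struct.unpack(f"{n_samples}b", record[12:12 + n_samples])
    return line_no, field_no, list(samples)


def split_e_record(record):
    """An E format record is eight D format lines joined in sequence."""
    size = WORDS_PER_LINE * BYTES_PER_WORD
    return [record[i * size:(i + 1) * size] for i in range(8)]
```

With 104 data words per line, each line carries 416 picture elements.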

3. End of file marks (EOF's) 

Each video tape file is marked by an EOF. Each tape has a second EOF after 
the last file. 

A file may contain one, three, or four video frames. The first monochrome 
data gathered to simulate 4 by 4 by 4 Hadamard transform operation, has four 
monochrome frames in time sequence in each file. Test images taken from these, 
and longer sequences made using the video disc analog recorder, have only one 
monochrome frame in each file. Color files always contain three frames, one for 
each primary color. 

4. Video Line Order 

DSTD/ESTD. For both D and E format tapes, the standard line order is in
the order of the temporal video scan. The lines are ordered 6 to 237 in field 0,
then 6 to 237 in field 1, and so on to field 5 or 7 if there are three or four
frames in the file.

DINT/EINT. For both D and E format tapes, the interlaced format is
organized by frames, rather than fields. Each line in the first or even field is
followed by the corresponding (same numbered) line in the second or odd field.
Relatively few data files are in the interlaced format, but conversion is simple,
and the compression programs accept interlaced input.
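
The conversion between the standard (field) and interlaced (frame) line orders amounts to the reordering sketched below. This illustrates the line order only and is not the DFMT program. The function names are hypothetical, and each line is treated as an opaque item.

```python
def std_to_interlaced(even_field, odd_field):
    """Merge two fields into frame order: each even-field line is
    followed by the same numbered line of the odd field."""
    if len(even_field) != len(odd_field):
        raise ValueError("both fields must have the same number of lines")
    frame = []
    for even_line, odd_line in zip(even_field, odd_field):
        frame.append(even_line)
        frame.append(odd_line)
    return frame


def interlaced_to_std(frame):
    """Split a frame-ordered list of lines back into its two fields."""
    return frame[0::2], frame[1::2]
```

The two functions are inverses, so a file can be moved between the two line orders without loss.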

5. Original Video Data Tapes

Many monochrome files (D101-125, D151-170) consist of eight fields in
time sequence. Other monochrome files are a single frame (two fields) forming
part of a sequence of about 85 frames (ET01, ET03, ET06, ET07).

Color files are either RGB or YIQ in frames 1, 2, and 3, respectively. Two
color sequences of 59 (EC09, 10, 11) and 46 (EC12, 13) frames exist. The first has
been reformatted in reverse field order. Single frames of higher color quality
have also been obtained. They were recorded in the original D color format,
which had a line given in each color successively, instead of by frames in each color.
The files DC13-22 have been converted from old DINT, RGB to new line order EINT,
YIQ using DFMT, DTOE, and ERTOY. DC01-10 could be similarly converted for use
with current compression programs.

6. Compressed Video Data Tapes 

Several demonstration tapes were prepared for a review in mid-January. These
are described here by tape label.

T1 - DC13-22, EINT, processed by T8x8

- frame interlace, 8x8, cosine, color

- 2 bpp, 1.5 Y, 0.25 I and Q

C1 - Man and Tool, 85 frames, E88F

- 1/8 bpp, monochrome, Hadamard, 8x8

C2 - Wheel, 59 frames, E88F

- 1/4 bpp, color Hadamard, 8x8

A1 to A8 - Wheel, 234 frames with reversal and two runs, E88FA

- 1/2 bpp, color Hadamard, 8x8

- new 2.3 bpp Y edge mode, new edge threshold

DEMO22 - Reasoner, monochrome

- results of 21 different compression tapes

- original, Hadamard 3 and 1.3 bpp, DPCM 3 and 1.6 bpp

- adaptive DPCM and subsampling 1.3 bpp
Two conference papers have been presented since the interim final report.

The topics were "Comparison of Video Fields and Frames for Transform Compression,"
by Harry W. Jones and Larry B. Hofman, and "The Karhunen-Loeve, Discrete Cosine,
and related Transforms Obtained Via the Hadamard Transform," by H. W. Jones,

D. N. Hein, and S. C. Knauer. Original drafts of these papers were included 
in the interim final report, as appendices A and B of Section B. Copies of the 
papers from the proceedings are included here. 
Abstract

Because of the interlaced television scan, the two fields that form an interlaced video
frame are generated 1/60 of a second apart. If the two fields are compressed
independently, the correlation between adjacent lines is unused. The transmission rate can be
reduced by using a field memory to form an interlaced frame. Four test images were
processed as fields and as interlaced frames, using both theoretical and experimental
compression designs. For comparable mean-square error and subjective appearance, field
compression requires about one-half bit per sample more than frame compression. However,
the overall transmission rate (the number of bits per image times the number of images
per second) is more meaningful than the number of bits per sample. When transform
compression at low transmission rates merges the adjacent lines, frame compression becomes
similar to field repeating, and the memory can be reduced.

Introduction 


This paper describes the results achieved, and the hardware required, for video
compression using either fields or interlaced frames. Interlacing the video fields, and the
inverse operation, requires substantial digital memory, but achieves a given compressed
image quality using a lower transmission rate.

In television transmission, the scene is repeatedly scanned to form a field image of
about 256 lines. As shown in Figure 1, each field consists of every second line in the
full frame, and the alternating fields (represented by solid or dashed lines) are displaced
vertically by one line. Two successive fields form the full video frame of about 512
lines. Fields are transmitted at the rate of 60 times per second, to avoid the
objectionable flicker effect which occurs at lower rates, even though 30 or fewer images per second
are sufficient for motion representation. (1)

It is possible to transmit sampled images at reduced bit rates because much of the
information in samples taken at the Nyquist rate is redundant. The successive samples are
not independent, and the video process can be described by a first order Markov model.
This Markov model fits the measured correlation of the four test images used here, as shown
in Figure 2 for the image of newscaster Harry Reasoner, and in Table 1 for all four test
images. The image frames usually have the highest correlation between adjacent samples in
adjacent lines in a frame (different fields), the next highest correlation between adjacent
samples in the same line, and the lowest correlation is between corresponding samples in
the closest lines in a field (separated by an alternate field line). This is as expected,
from the 4-to-3 width-to-height aspect ratio of the video frame, and because the four test
frames have low motion or change between fields.
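
The kind of correlation measurement referred to above, and the first order Markov model it is compared with, can be sketched as follows. The function names are illustrative, and the sketch is not the measurement program used for Table 1. Under the first order Markov model, the correlation between samples k positions apart is taken to be the adjacent-sample correlation raised to the power k.

```python
def correlation(x, y):
    """Correlation coefficient between two equal-length sample sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def in_line_correlation(line):
    """Correlation between horizontally adjacent samples of one line."""
    return correlation(line[:-1], line[1:])


def between_line_correlation(line_a, line_b):
    """Correlation between corresponding samples of two lines."""
    return correlation(line_a, line_b)


def markov_correlation(rho, lag):
    """First order Markov model: correlation at a separation of `lag` samples."""
    return rho ** abs(lag)
```

Passing two adjacent frame lines, or the two closest lines of one field, to `between_line_correlation` gives the two vertical correlations compared in the text.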

Because higher correlation allows more transmission rate compression, the correlation
values indicate that it is most effective to use samples in the adjacent lines of alternate
fields, next most effective to use samples in the current line, and least effective to use
samples in the closest lines of the same field. The cost of using these samples is the
memory required to store them. Using samples in the same line requires a few samples to a
line of samples to be stored; using the closest lines in the same field requires several
lines of memory; and using samples in the adjacent lines of the alternate field requires a
full field of memory. Obtaining the lowest possible transmission rate can require much
more memory than less efficient systems. This effect is also apparent in conditional
replenishment systems, which use the correlation between successive frames in time.

*The research work leading to this paper was performed under contract NAS2-9703, sponsored
by the National Aeronautics and Space Administration, Ames Research Center, Moffett Field,
California 94035.
COMPARISON OF VIDEO FIELDS AND FRAMES FOR TRANSFORM COMPRESSION


Transform Compression Systems 



Computer simulations of video image compression systems were undertaken to compare the
performance of field and frame compression. All experiments involved single image
compression of monochrome television images. Digitized images were obtained by sampling a
standard NTSC baseband signal at 8.064 x 10^6 samples per second. Each sample was
represented by a six-bit integer. The visible area of the images has 416 samples per line and
464 lines per frame. The nominal 512 samples and 512 lines includes samples in the
horizontal retrace and lines in the vertical interval. The television images were
compressed both as fields of 232 lines and as interlaced frames of 464 lines.

The compression experiments used Hadamard transforms of 8 by 8 subpictures. The
coefficients of the 64 Hadamard vectors are used to represent the subpictures. The 8 by 8
Hadamard basis vectors are shown in sequency order in Figure 3. Sequency is defined here
as the total number of white-black and black-white transitions in the horizontal or
vertical directions. If the Hadamard transform is not normalized, the vector coefficients
have a possible range of 12 bits, since each is the result of 64 additions or subtractions
of 6 bit numbers. The vector coefficients were first rounded to the 8 most significant
bits, and then quantized to an 8-bit integer. Transmission rate compression is achieved by
using fewer than 64 quantizer levels, and indicating each using a code word shorter than 6
bits. In the final compressed picture, each sample is represented using a 6-bit integer,
as in the original image.
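
A minimal sketch of an unnormalized 8 by 8 Hadamard transform in sequency order is given below. It illustrates the coefficient range argument above and is not the simulation code used for the paper. The Sylvester construction and the function names are choices made for this sketch.

```python
def hadamard(n):
    """Sylvester construction of an n by n Hadamard matrix (n a power of 2)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def sequency(row):
    """Number of sign changes (white-black or black-white) along a row."""
    return sum(1 for a, b in zip(row, row[1:]) if a != b)


def hadamard_2d(block):
    """Unnormalized 2-D Hadamard transform with rows in sequency order."""
    n = len(block)
    h = sorted(hadamard(n), key=sequency)
    # Transform the columns, then the rows.
    tmp = [[sum(h[u][i] * block[i][j] for i in range(n)) for j in range(n)]
           for u in range(n)]
    return [[sum(tmp[u][j] * h[v][j] for j in range(n)) for v in range(n)]
            for u in range(n)]
```

For a flat subpicture at the largest 6 bit value, 63, the zero sequency coefficient is 64 times 63, or 4032, which fits in 12 bits, and every other coefficient is zero.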


Figure 4 shows the hardware organization of the independent field transform compressor.
The input lines are converted to digital samples, stored in an 8-line memory, transformed
in 8 sample by 8 line blocks, and quantized. The quantized bit stream is transmitted, and
the inverse process is used to generate analog video. The field compressor uses the
correlation between the samples in a line and between the lines in a field. Some
correlations are not used, since each 8 by 8 subpicture is processed independently.

Figure 5 shows an interlaced frame transform compressor, which performs the same
functions as the field processor. It differs because the 8 by 8 subpictures are taken from
an interlaced frame, rather than from one field. The 8 by 8 frame subpictures have
one-half the height of field subpictures. In order to interlace a frame, the first field is
held in memory until the second field is being generated. The fields are then interlaced
and the subpictures are transformed. Information on the two fields is partly transmitted
and partly stored during the second field time, and the stored information is transmitted
during the next field time. The receiver output display is not synchronized to the data
transmission, as it was in the field compressor. To provide the correct display, the
receiver requires a compressed memory to hold the frame in the transmitted form. This
memory is decoded twice, to provide the two fields. The memory required is one field at 8
bits and one-half of a compressed frame (assumed to require 1 bit per sample) at the
encoder, and one compressed frame at the decoder (assumed to require 2 bits per sample), or
the equivalent of one frame at 7 bits (i.e., 7 bit frames). This is the cost of using the
correlation between adjacent lines in a frame rather than the correlation between the
closest lines in a field.


The interlaced frame compressor of Figure 5 uses transmitter memories which are
alternately fully written, and then fully read once, so that the memory information is of
no further use during one-half of the memory cycle. A reviewer of this paper has
devised a frame interlace system which improves memory efficiency by reading four lines
from memory, and then immediately replacing them with four current lines for future use.
This is made possible by redefining the concept of an interlaced frame; rather than using a
frame of two successive fields, the frame definition is changed every four lines. For the
current input field, the first four lines are combined with four lines from the previous
field, the second four lines are placed in memory for combination with the following field,
and so on alternately every four lines. This improved design requires one-half field of
uncompressed memory at both the transmitter and receiver, for a total memory requirement of
4 bit frames.


System Simulation Results

Figure 6 shows the mean-square error results obtained (in units of the least significant
of the six original bits) when the Harry Reasoner test image was compressed using
theoretical compression designs. The different compression designs consist of the bit
assignments and quantizers for each of the 64 Hadamard vector coefficients. The
theoretical designs assumed the first order Markov correlation model (with the "assumed in
design" values of Table 1), an exponential distribution for the transform vector
coefficients, and the mean-square error measure. At the same rates, field compression
produces larger error than frame compression, or, equivalently, field compression requires
more rate for a given error. However, there are two cases in the field data, and one case
in the frame data, where 1/2 or 1 bit per sample increases in the transmission rate produce
little or no reduction in error. The theoretical designs obviously do not make the best
possible use of the transmission rate.

SPIE Vol. 149, Applications of Digital Image Processing (1978)

Figure 7 shows the mean-square error obtained when the Harry Reasoner test image was
compressed using experimental compression designs. The curves are smooth, and added rate
always reduces error. The experimental designs give much lower error than the theoretical
designs. The field compression curve for the experimental designs (Fig. 7) is nearly
identical to the frame curve for the theoretical designs (Fig. 6), from 4 bits per sample
down to 1 bit per sample. The experimental designs used are similar to designs obtained by
trial and error, but were generated using a formalized procedure based on the requirement
of good representation for both the edges and the low detail areas in video images. For a
full discussion of the theoretical and experimental designs used, see Ref. 6.

Figure 7 shows that, over most of the range of transmission rates, field compression
requires a transmission rate about 50% greater than frame compression, for the same
mean-square error. At the highest rate shown, 4 bits per sample, the mean-square error is
caused by rounding all the transform vectors to 8 bits, and all methods give about the same
error.

Figure 8 shows the mean-square error obtained using the experimental compression designs
on all four test images, in both frame and field compression. Because of the wide range in
the detail and correlation of the test images, the mean-square error at each transmission
rate ranges over an order of magnitude, and the mean-square error is plotted on a log
scale. Even though the test images differ greatly, the parallel curves of Figure 8 show
that the increased rate required for field compression is nearly constant, about 1/2 bit
per sample, for these images over the range of transmission rates between 1/2 and 2 bits
per sample. It seems that the frame or field compression trade-off can be summarized as
4-7 bit frames of memory for 1/2 bit per sample in transmission rate.

The subjective impressions of the compressed images agree in quality ranking with the
mean-square error results. Figure 9 shows the original image of Harry Reasoner. Figure 10
shows this image compressed using 1 bit per sample in the frame, and Figure 11 shows it
compressed using 2 bits per sample in the field. The compressed images exhibit edge
degradation, especially at the shoulders, lips, collar, and tie. The field image at 2 bits
per sample has somewhat higher quality than the frame image at 1 bit per sample, as
indicated by the error values of Figures 7 and 8. (Originals of all the test images are
shown in Ref. 6.)


Time Effects 

The above comparison of frame and field video compression considered only the quality of
the individual images, and ignored the effects of motion and the time sequence of images.
The two fields in a frame are generated 1/60 of a second apart in time, and motion tends to
make the correlation between adjacent lines in a frame lower than the correlation between
the closest lines in the same field. A fifth test image, of a blurred hand moving rapidly
over a writing pad, was compressed in the same way as the four other test images. Because
of the motion, the mean-square error was lower for field compression. (An original of the
pad image is shown in Ref. 7.) Transform compression, especially at lower transmission
rates, tends to average adjacent samples and lines. Two fields processed as a frame become
similar, and high motion areas where the original fields differ become blurred. In frame
compression of high motion scenes at low transmission rates, the motion update rate is the
frame rate, 30 per second, rather than the field rate, 60 per second. Because the frame
rate is adequate for representing motion, this is not an impairment.

These observations suggest a third approach to video compression. Since frame
processing tends to average the two fields at lower transmission rates (which would reduce
vertical resolution and the motion update rate to one-half their original values),
frame compression is more similar to field repeat compression than to independent,
two-field processing. In field-repeat compression, only one-half the fields are transformed
and transmitted, and each transmitted field is displayed twice at the decoder. Figure 12
shows the block diagram of a field-repeat compression system. As the field to be
transmitted is generated, half the current information is transmitted and half stored. The
full compressed field is retained in the decoder, for repeated display. The total memory
requirement for field-repeat compression is 1 1/2 bit frames, rather than 4 or 7 bit frames
for frame compression. A real time hardware system using the Hadamard transform and field
repeat has been previously described.

Since a single field has only one-half of the samples in a frame, the same overall
transmission rate is obtained when the number of bits per transmitted sample is doubled for
field repeat. The overall rate is the number of bits per image multiplied by the number of
images per second. Field repeat compression transmits only one-half the field images used
in frame or field compression, as discussed above. The previous mean-square error results
also indicate the performance of field-repeat, since the same error in each transmitted
field is obtained if one or two independent fields are transmitted. Field-repeat
compression at 2 bits per sample has the same error as field compression at 2 bits per
sample, but the overall transmission rate for field-repeat corresponds to that for frame or
field compression at 1 bit per sample. Field-repeat compression can be compared to frame
or field compression in Figures 6-8 by moving each point on the field compression curve to
a point having the same mean-square error and one-half the transmission rate. This shows
that the error is slightly lower for field-repeat compression than for frame compression,
and much lower than for field compression.
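
The rate bookkeeping in this paragraph can be checked with a few lines of arithmetic. The
following is only an illustrative sketch: the field size of 256 x 256 samples is a
hypothetical value chosen for the example, and the function name is not taken from the
report.

```python
def overall_rate(bits_per_sample, samples_per_field, fields_sent_per_second):
    """Overall rate in bits per second: bits per image times images per second."""
    return bits_per_sample * samples_per_field * fields_sent_per_second


SAMPLES_PER_FIELD = 256 * 256   # hypothetical field size, for illustration only
FIELDS_PER_SECOND = 60          # two interlaced fields per 1/30 second frame

# Frame or field compression transmits every field (60 per second).
frame_rate_1bpp = overall_rate(1, SAMPLES_PER_FIELD, FIELDS_PER_SECOND)
field_rate_2bpp = overall_rate(2, SAMPLES_PER_FIELD, FIELDS_PER_SECOND)

# Field repeat transmits only every other field (30 per second).
repeat_rate_2bpp = overall_rate(2, SAMPLES_PER_FIELD, FIELDS_PER_SECOND // 2)

print(frame_rate_1bpp, field_rate_2bpp, repeat_rate_2bpp)
```

Under these assumptions, field repeat at 2 bits per sample costs the same overall rate as
frame compression at 1 bit per sample, and half the rate of independent field compression
at 2 bits per sample.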

Figure 15 shows a field-repeat image of the first field of Figure 11. This field-repeat
image requires the same overall transmission rate as the frame processed image in Figure
10, having one field at 2 bits per sample rather than one frame at 1 bit per sample, and
the subjective quality is similar. The field-repeat image has lower quality than the field
compressed image, but that image has two independent fields at 2 bits per sample and
requires twice the overall transmission rate. In field-repeat, vertical resolution is
noticeably reduced in detailed areas, and quantization noise and contouring are more
apparent in background areas. It should be reemphasized that field-repeat compression is
appropriate at the lower transmission rates, where it is not possible to provide the full
potential resolution of uncompressed video.

Conclusion 

Experimental simulations of interlaced frame and independent field compression systems
indicate that frame compression can achieve a transmission rate about 1/2 bit per sample
lower than field processing at a given image quality, with the added requirement of 4 or 7
bit frames of memory. Frame transform compression can be used at lower transmission rates
than field compression, but replaces the two fields in the frame with two similar
combinations of the original fields. If it is decided to use only one field in field-repeat
compression, performance similar to frame compression at low transmission rates can be
obtained using only 1 1/2 bit frames of memory. A conditional replenishment compressor,
which uses the correlation between successive frames, can be implemented using 7 bit frames
of memory.

Table 1. Range of the correlation parameter R for D equal to 1 through 7, where R = C^(1/D),
C is measured correlation and D is sample distance. (A "?" marks a digit that is illegible
in the source.)

Picture      In-line        Between lines in field   Between lines in frame

Reasoner     0.966-0.987    0.967-0.975              0.9?4-0.989
Two girls    0.96?-0.982    0.946-0.950              0.9?2-0.978
Two men      [illegible]    0.955-0.946              0.968-0.972
Band         0.847-0.882    0.807-0.877              0.888-0.916

Fig. 1. The interlaced video frame,
with the first field given by solid
lines and the second field given by
dashed lines.

Fig. 6. Rate versus mean-square error
for the Harry Reasoner image compressed
using theoretical compression designs.

Fig. 7. Rate versus mean-square error
for the Harry Reasoner image compressed
using experimental compression designs.

ABSTRACT


The Karhunen-Loeve transform for stationary data, the discrete cosine transform, the
Walsh-Hadamard transform, and most other commonly used transforms have one-half even and
one-half odd transform vectors. Such even/odd transforms can be implemented by following a
Walsh-Hadamard transform by a sparse matrix multiplication, as previously reported by Hein
and Ahmed for the discrete cosine transform. The discrete cosine transform provides data
compression nearly equal to that of the Karhunen-Loeve transform, for the first order Markov
correlation model. The Walsh-Hadamard transform provides most of the potential data
compression for this correlation model, but it always provides less data compression than
the discrete cosine transform. Even/odd transforms can be designed to approach the
performance of the Karhunen-Loeve or discrete cosine transform, while meeting various
restrictions which can simplify hardware implementation. The performance of some even/odd
transforms is compared theoretically and experimentally. About one-half of the performance
difference between the Walsh-Hadamard and the discrete cosine transforms is obtained by
simple post-processing of the Walsh-Hadamard transform coefficients.

INTRODUCTION 

It is well known that the Karhunen-Loeve, or eigenvector transform (KLT), provides
decorrelated vector coefficients with the maximum energy compaction, and that the discrete
cosine transform (DCT) is a close approximation to the KLT for first-order Markov data (1).
We will show that the general class of even/odd transforms includes this particular KLT, as
well as the DCT, the Walsh-Hadamard transform (WHT), and other familiar transforms. The more
complex even/odd transforms can be computed by combining a simpler even/odd transform with
a sparse matrix multiplication. A theoretical performance measure is computed for some
even/odd transforms, and two image compression experiments are reported.

EVEN/ODD TRANSFORMS

Orthogonal transforms are frequently used to compress correlated sampled data. Most commonly
used transforms, including the Fourier, slant, DCT, and WHT, have one-half even and one-half
odd transform vectors. Several properties of such even/odd transforms are given in this
section. The even vector coefficients are uncorrelated with the odd vector coefficients for
a data correlation class which includes stationary data. The KLT is an even/odd transform
for this class of data correlation. A conversion from one even/odd transform to another
requires only multiplication by a sparse matrix, having one-half of its elements equal to
zero.

If N, the number of data points, is an even number, a vector

\[ V = (v_1, v_2, \ldots, v_N) \]

is said to be even if

\[ v_i = v_{N+1-i}, \qquad i = 1, \ldots, N/2, \]

and is odd if

\[ v_i = -v_{N+1-i}, \qquad i = 1, \ldots, N/2. \]


*Portions of this research were performed under NASA contract NAS2-9703 and NASA Interchange
Agreement No. NrA?-OR-16V702.
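
For a data vector of length N,

\[ X^T = (x_1, x_2, \ldots, x_N), \]

the N x N correlation matrix is given by

\[ \Sigma_x = E(XX^T). \]

Since \Sigma_x is a symmetric matrix, it can be partitioned into four N/2 x N/2 submatrices
in the following manner:

\[ \Sigma_x = \begin{bmatrix} A & B \\ B^T & C \end{bmatrix}, \qquad A = A^T \text{ and } C = C^T. \]

The general form for a transform matrix with one-half even and one-half odd basis vectors
(called an even/odd transform) can be written as a partitioned matrix

\[ H = \frac{1}{\sqrt{2}} \begin{bmatrix} E & \bar{E} \\ D & -\bar{D} \end{bmatrix}, \]

where E and D are N/2 x N/2 orthogonal matrices, and \bar{E} and \bar{D} are formed by
reversing the order of the columns in E and D, that is,

\[ \bar{E} = E\tilde{I}, \qquad \bar{D} = D\tilde{I}, \]

where the permutation matrix \tilde{I} is the opposite diagonal identity matrix. The matrix
H can then be factored into the product of two matrices:

\[ H = \frac{1}{\sqrt{2}} \begin{bmatrix} E & 0 \\ 0 & D \end{bmatrix} \begin{bmatrix} I & \tilde{I} \\ I & -\tilde{I} \end{bmatrix}. \]
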

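It is next shown that the even and odd vector coefficients of an even/odd transform are
uncorrelated, for a general class of data correlation matrices. The correlation matrix for
the transformed data vector, Y = HX, is given by the similarity transform

\[ \Sigma_y = H \Sigma_x H^T = \frac{1}{2} \begin{bmatrix}
E(A + B\tilde{I} + \tilde{I}B^T + \tilde{I}C\tilde{I})E^T & E(A - B\tilde{I} + \tilde{I}B^T - \tilde{I}C\tilde{I})D^T \\
D(A + B\tilde{I} - \tilde{I}B^T - \tilde{I}C\tilde{I})E^T & D(A - B\tilde{I} - \tilde{I}B^T + \tilde{I}C\tilde{I})D^T
\end{bmatrix}. \]
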

The even and odd vector coefficients are uncorrelated when the opposite diagonal submatrices
are identically zero. This is obviously true in the special case where

\[ A = \tilde{I}C\tilde{I}, \qquad B = \tilde{I}B^T\tilde{I}. \]

These equations state that the data correlation matrix is symmetric about both the main
diagonal and the opposite diagonal. This condition is satisfied if E(x_i x_j) is a function
of the magnitude of i - j, that is, if the process is stationary. For stationary data, the
correlation matrix is a symmetric Toeplitz matrix (1, 2).
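
The decorrelation property can be checked numerically. The sketch below is an illustration
under stated assumptions rather than the authors' procedure: it uses the Sylvester
construction of the Walsh-Hadamard matrix, a stationary correlation matrix with entries
r^|i-j|, and the arbitrary choices N = 8 and r = 0.95. All function names are invented for
the example.

```python
def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def is_even(v):
    """True if v reads the same from both ends: v[i] == v[N-1-i]."""
    return all(v[i] == v[-1 - i] for i in range(len(v) // 2))


def is_odd(v):
    """True if reversing v negates it: v[i] == -v[N-1-i]."""
    return all(v[i] == -v[-1 - i] for i in range(len(v) // 2))


def stationary_correlation(n, r):
    """Symmetric Toeplitz correlation matrix with entries r**|i - j|."""
    return [[r ** abs(i - j) for j in range(n)] for i in range(n)]


def transformed_correlation(h, sigma):
    """Return H * Sigma * H^T, where the rows of H are the transform vectors."""
    n = len(h)
    hs = [[sum(h[i][k] * sigma[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    return [[sum(hs[i][k] * h[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


N, R = 8, 0.95                  # illustrative block size and correlation
H = hadamard(N)                 # unnormalized; scaling does not affect the zeros
even_rows = [i for i, v in enumerate(H) if is_even(v)]
odd_rows = [i for i, v in enumerate(H) if is_odd(v)]
sigma_y = transformed_correlation(H, stationary_correlation(N, R))

# Largest correlation between any even coefficient and any odd coefficient.
max_cross = max(abs(sigma_y[i][j]) for i in even_rows for j in odd_rows)
print(len(even_rows), len(odd_rows), max_cross)
```

Half of the vectors are classified as even and half as odd, and the even/odd cross terms
vanish to within floating-point rounding.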

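This decorrelation property of even/odd transforms is used to show that the KLT is an
even/odd transform. For K, a reordered matrix of the KLT vectors, the transformed vector is
Z, Z = KX. The correlation matrix for the transformed vector is given by

\[ \Sigma_z = K \Sigma_x K^T. \]

Suppose that the data are first transformed by the matrix H above, and that the data are
such that the even and odd coefficients are uncorrelated by this even/odd transform:

\[ \Sigma_y = H \Sigma_x H^T
 = \begin{bmatrix} E(A + B\tilde{I})E^T & 0 \\ 0 & D(A - B\tilde{I})D^T \end{bmatrix}
 = \begin{bmatrix} Y_1 & 0 \\ 0 & Y_2 \end{bmatrix}. \]

Since H is invertible, K = \mathcal{A}H, for \mathcal{A} = KH^{-1}, and

\[ \Sigma_z = \mathcal{A} H \Sigma_x H^T \mathcal{A}^T
 = \mathcal{A} \begin{bmatrix} Y_1 & 0 \\ 0 & Y_2 \end{bmatrix} \mathcal{A}^T. \]

Suppose that

\[ \mathcal{A} = \begin{bmatrix} \mathcal{A}_1 & \mathcal{A}_2 \\ \mathcal{A}_3 & \mathcal{A}_4 \end{bmatrix}. \]

Then

\[ \Sigma_z = \begin{bmatrix}
\mathcal{A}_1 Y_1 \mathcal{A}_1^T + \mathcal{A}_2 Y_2 \mathcal{A}_2^T & \mathcal{A}_1 Y_1 \mathcal{A}_3^T + \mathcal{A}_2 Y_2 \mathcal{A}_4^T \\
\mathcal{A}_3 Y_1 \mathcal{A}_1^T + \mathcal{A}_4 Y_2 \mathcal{A}_2^T & \mathcal{A}_3 Y_1 \mathcal{A}_3^T + \mathcal{A}_4 Y_2 \mathcal{A}_4^T
\end{bmatrix}. \]

Since the KLT produces N fully decorrelated coefficients, \Sigma_z is a diagonal matrix.
Either both \mathcal{A}_2 and \mathcal{A}_3, or both \mathcal{A}_1 and \mathcal{A}_4, must
be identically zero. For \mathcal{A}_2 = \mathcal{A}_3 = 0, the first N/2 vectors remain
even, while for \mathcal{A}_1 = \mathcal{A}_4 = 0, the even and odd vectors are
interchanged in K. In the first case,

\[ K = \mathcal{A}H
 = \frac{1}{\sqrt{2}} \begin{bmatrix} \mathcal{A}_1 & 0 \\ 0 & \mathcal{A}_4 \end{bmatrix}
   \begin{bmatrix} E & \bar{E} \\ D & -\bar{D} \end{bmatrix}
 = \frac{1}{\sqrt{2}} \begin{bmatrix} \mathcal{A}_1 E & \mathcal{A}_1 \bar{E} \\ \mathcal{A}_4 D & -\mathcal{A}_4 \bar{D} \end{bmatrix}. \]

The KLT is an even/odd transform for the class of correlation matrices for which even/odd
transforms decorrelate the even and odd vector coefficients.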

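If H and J are two even/odd N x N transforms, the multiplication matrix for conversion
between them is sparse. With

\[ H = \frac{1}{\sqrt{2}} \begin{bmatrix} E & \bar{E} \\ D & -\bar{D} \end{bmatrix}, \qquad
   J = \frac{1}{\sqrt{2}} \begin{bmatrix} F & \bar{F} \\ G & -\bar{G} \end{bmatrix}, \qquad
   H = SJ, \]

the conversion matrix is

\[ S = HJ^T
 = \frac{1}{2} \begin{bmatrix} E & \bar{E} \\ D & -\bar{D} \end{bmatrix}
   \begin{bmatrix} F^T & G^T \\ \bar{F}^T & -\bar{G}^T \end{bmatrix}
 = \frac{1}{2} \begin{bmatrix} EF^T + \bar{E}\bar{F}^T & EG^T - \bar{E}\bar{G}^T \\
   DF^T - \bar{D}\bar{F}^T & DG^T + \bar{D}\bar{G}^T \end{bmatrix}. \]

Since

\[ \bar{E}\bar{G}^T = E\tilde{I}(G\tilde{I})^T = E\tilde{I}\tilde{I}^T G^T = EG^T, \]

and similarly for the other products,

\[ S = \begin{bmatrix} EF^T & 0 \\ 0 & DG^T \end{bmatrix}. \]

The conversion between any two even/odd transforms requires N^2/2 rather than N^2
multiplications.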
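
As a numerical illustration of this sparsity, the sketch below builds the conversion
matrix from the WHT to the DCT for N = 8 and counts its nonzero elements. It is a minimal
example under assumptions: the DCT is taken to be the orthonormal DCT-II, the WHT comes
from the Sylvester construction, and both sets of vectors are reordered with the even
vectors first. The helper names are invented for the example.

```python
import math


def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def dct_matrix(n):
    """Orthonormal DCT-II basis vectors as rows (assumed form of the DCT)."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1 if k == 0 else 2) / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows


def is_even(v, tol=1e-12):
    """True if v[i] equals v[N-1-i] to within tol."""
    return all(abs(v[i] - v[-1 - i]) < tol for i in range(len(v) // 2))


def even_first(rows):
    """Reorder basis vectors so the even ones come before the odd ones."""
    return [v for v in rows if is_even(v)] + [v for v in rows if not is_even(v)]


N = 8
WHT = even_first([[x / math.sqrt(N) for x in row] for row in hadamard(N)])
DCT = even_first(dct_matrix(N))

# S converts WHT coefficients into DCT coefficients: DCT = S * WHT, S = DCT * WHT^T.
S = [[sum(a * b for a, b in zip(d, w)) for w in WHT] for d in DCT]

half = N // 2
cross_block = max(abs(S[i][j]) for i in range(N) for j in range(N)
                  if (i < half) != (j < half))
nonzero = sum(1 for row in S for s in row if abs(s) > 1e-12)
print(cross_block, nonzero)
```

The two off-diagonal N/2 x N/2 blocks come out zero to rounding error, so at most N^2/2
elements of S can be nonzero.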

We have shown that the class of even/odd transforms has no correlation between the even and
odd vector coefficients, for a class of data correlation matrices including the stationary
data matrix. The KLT for this data correlation class, and many familiar transforms, are
even/odd transforms. The coefficients of any even/odd transform can be obtained by a sparse
matrix multiplication of the coefficients of any other even/odd transform. This observation
was the basis of a previous implementation of the DCT and suggested the investigation of
even/odd transforms described below.

THE DISCRETE COSINE TRANSFORM OBTAINED VIA THE HADAMARD TRANSFORM

Hein and Ahmed have shown how the DCT vectors can be obtained by a sparse matrix
multiplication on the WHT vectors (3, 4). Since the DCT, unlike the general KLT, has a
constant vector and a shifted square-wave vector in common with the WHT, the number of
matrix multiplications is less than N^2/2. The A matrix, which generates the DCT vectors for
N = 8 from the WHT vectors, is given by Hein and Ahmed, and is reproduced here as Figure 1.
Although this implementation of the DCT requires more operations for large N than the most
efficient DCT implementation (5), it is very satisfactory for N equal to 8.

If a transform has even and odd vectors and has a constant vector, as is typical, it can be
obtained via the WHT in the same way as the DCT. The slant transform is an example (1, 6).
A hardware implementation of the DCT via the WHT is being constructed at Ames Research
Center, using N = 8 and the matrix of Figure 1. Since this implementation contains the
matrix multiplication factors in inexpensive read-only memories, it will be possible to
consider the real-time quantization design and evaluation of a large class of transforms.
Transforms with suboptimum performance are acceptable only if they can be implemented with
reduced complexity. Transform performance can be determined theoretically from the vector
energy compaction, while the implementation complexity can be estimated from the number and
type of operations added after the WHT.

COMPARISON OF TRANSFORMS USING THE FIRST-ORDER MARKOV CORRELATION MODEL

It is generally accepted that the sample-to-sample correlation of an image line scan is
approximated by the first-order Markov model (7):

\[ c(x_i, x_j) = c(|i - j|) = r^{|i-j|}. \]

The correlation of adjacent samples, r, varies from 0.99 for low detail images to 0.80 for
high detail images, with an average of about 0.95 (8). The correlation matrix, \Sigma_x,
was generated using the first-order Markov model, for various r, and the corresponding
KLT's and vector energies were numerically computed. (The analytic solution is known (9).)
In addition, the matrix \Sigma_x was used to compute the transform vector energies and
correlations for the WHT, DCT, and other transforms.

As is well known, the KLT vectors for r = 0.95 are very similar to the DCT vectors and have
nearly identical vector energies (1. 1). The most apparent difference between the DCT and
the KLT is that the KLT vector corresponding to the constant DCT vector is not exactly
constant, but weights the central samples in a fixed transform block more than samples near
the edge of the block. As r approaches 1.00, this KLT vector approaches the constant vector,
and all the KLT vectors approach the corresponding DCT vectors. The vector energies of the
KLT and the DCT are nearly identical for r greater than 0.90, and differ only slightly for
r greater than 0.50. The KLT and DCT vector energies for N = 8 and r = 0.50 are plotted in
Figure 2. The energy compaction at r = 0.5 is much less than at the typical r = 0.95.
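
The remark about the first KLT vector can be illustrated with a short calculation. This
sketch assumes the first-order Markov correlation matrix with entries r^|i-j| and uses
plain power iteration, which is a stand-in chosen for brevity and not necessarily the
numerical method used for the results reported here.

```python
def markov_correlation(n, r):
    """First-order Markov correlation matrix with entries r**|i - j|."""
    return [[r ** abs(i - j) for j in range(n)] for i in range(n)]


def dominant_eigenvector(m, iterations=500):
    """Power iteration: unit-norm eigenvector of the largest eigenvalue of m."""
    v = [1.0] * len(m)
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in m]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


def center_to_edge_ratio(r, n=8):
    """Ratio of a central weight to an edge weight in the first KLT vector."""
    v = dominant_eigenvector(markov_correlation(n, r))
    return v[n // 2] / v[0]


first_klt = dominant_eigenvector(markov_correlation(8, 0.95))
ratio_095 = center_to_edge_ratio(0.95)
ratio_099 = center_to_edge_ratio(0.99)
print(first_klt, ratio_095, ratio_099)
```

The resulting vector is symmetric and weights the central samples more heavily than the
edge samples, and the center-to-edge ratio moves toward 1 as r increases from 0.95 to 0.99.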

The rate-distortion performance of a transform depends on the transform energy compaction.
If the distortion d is less than the coefficient variance \sigma_i^2 for all i, all N
transform vectors are quantized and transmitted. The number of bits required is (10):

\[ b = \frac{1}{2} \sum_{i=1}^{N} \log_2(\sigma_i^2/d)
     = \frac{1}{2} \sum_{i=1}^{N} \log_2 \sigma_i^2 - \frac{N}{2} \log_2 d. \]

The first term of b can be used as a figure of merit for a transform:

\[ f = \frac{1}{2} \sum_{i=1}^{N} \log_2 \sigma_i^2. \]


The figure of merit f is a negative number; the larger its magnitude, the greater the rate
reduction achieved by the transform. Table I gives f for the KLT, DCT, WHT, and two even/odd
transforms that will be described below. At correlation r = 0.95, the KLT gains 0.014
bits more than the DCT and 1.183 bits more than the WHT. The WHT achieves most of the
available data compression, and the DCT achieves nearly all. As this rate reduction is
obtained for 8 vectors, the increased compression of the DCT over the WHT, for
r = 0.95, is 1.169/8, or 0.15 bits per sample.
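
A figure of merit of this kind can be computed directly from the correlation model. The
sketch below is an illustration under assumptions, not a reproduction of Table I: it takes
f as one-half the sum of log2 of the coefficient variances, uses unit-variance data with
correlation r^|i-j|, an orthonormal DCT-II, and a Sylvester-construction WHT. For the KLT
it uses the known determinant (1 - r^2)^(N-1) of this correlation matrix, since the product
of the KLT variances equals the determinant.

```python
import math


def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def dct_matrix(n):
    """Orthonormal DCT-II basis vectors as rows (assumed form of the DCT)."""
    return [[math.sqrt((1 if k == 0 else 2) / n)
             * math.cos(math.pi * (2 * j + 1) * k / (2 * n)) for j in range(n)]
            for k in range(n)]


def markov_correlation(n, r):
    """First-order Markov correlation matrix with entries r**|i - j|."""
    return [[r ** abs(i - j) for j in range(n)] for i in range(n)]


def coefficient_variances(basis, sigma):
    """Variance of each coefficient: the diagonal of basis * sigma * basis^T."""
    n = len(sigma)
    out = []
    for v in basis:
        sv = [sum(sigma[i][j] * v[j] for j in range(n)) for i in range(n)]
        out.append(sum(v[i] * sv[i] for i in range(n)))
    return out


def figure_of_merit(variances):
    """One-half the sum of log2 of the coefficient variances."""
    return 0.5 * sum(math.log2(s) for s in variances)


N, R = 8, 0.95
sigma = markov_correlation(N, R)
wht = [[x / math.sqrt(N) for x in row] for row in hadamard(N)]
f_wht = figure_of_merit(coefficient_variances(wht, sigma))
f_dct = figure_of_merit(coefficient_variances(dct_matrix(N), sigma))
f_klt = 0.5 * (N - 1) * math.log2(1 - R * R)   # from det = (1 - r^2)^(N-1)
gain_per_sample = (f_wht - f_dct) / N          # DCT advantage over the WHT
print(f_klt, f_dct, f_wht, gain_per_sample)
```

Values computed this way need not match the tabulated ones digit for digit, because the
normalization and numerical procedure behind Table I are not restated in the text.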

EVEN/ODD TRANSFORMS OBTAINED VIA THE WALSH-HADAMARD TRANSFORM

The sequency of a transform vector is defined as the number of sign changes in the vector.
The vector sequencies of the vectors corresponding to the matrix of Figure 1 are in
bit-reverse order, as indicated (0, 4, 2, 6, 1, 5, 3, 7). The energy compaction of the WHT
and DCT for r = 0.95 and N = 8 is shown in Figure 3. In the conversion from WHT to DCT, the
two-by-two matrix operation on vectors 2 and 6 transfers energy from 6 to 2. The
four-by-four matrix operation on the vectors of sequency 1, 5, 3, and 7 reduces the energy
of 3, 5, and 7 and increases the energy of 1. These operations remove most of the residual
correlation of the WHT vectors. The matrix multiplication requires 20 multiplications by 10
different factors (15 factors including sign differences).
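
The sequency definition and the bit-reverse ordering can be made concrete with a small
example. The function names below are illustrative, and the Sylvester construction is an
assumed way of generating the WHT vectors; it produces them in "natural" order, which
differs from both sequency order and bit-reverse order.

```python
def sequency(v):
    """Number of sign changes between adjacent elements of v."""
    return sum(1 for a, b in zip(v, v[1:]) if a * b < 0)


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    out = 0
    for _ in range(bits):
        out = (out << 1) | (i & 1)
        i >>= 1
    return out


def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


# Sequency of each row of the Sylvester-ordered matrix for N = 8.
natural_order = [sequency(row) for row in hadamard(8)]

# Bit reversal of 0..7 with three bits gives the ordering quoted in the text.
bit_reversed_order = [bit_reverse(i, 3) for i in range(8)]

print(natural_order)
print(bit_reversed_order)
```

Every sequency from 0 to 7 occurs exactly once among the eight WHT vectors, so sorting the
rows by `sequency` yields the sequency-ordered transform.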

We first consider a simplified operation on the 2 and 6 and the 1 and 3 sequency vectors.
This operation consists of multiplying the WHT vectors by matrix B (Figure 4). This further
transform is designed to reduce correlation and to generate new transform vectors in a way
somewhat similar to the A matrix multiplication which produces the DCT. There are two
identical two-by-two operations, and a total of eight multiplications by two different
factors (three including sign). The energy compaction of the B-matrix transform is shown in
Figure 3, with the energies of the WHT and DCT. As the B-matrix transform vectors of
sequency 0, 4, 5, and 7 are identical to the WHT vectors, they have identical energy. The
B-matrix transform vectors of sequencies 0, 1, 2, 3, 4, and 6 are identical to the
corresponding DCT vectors (0, 4) or very similar. For example, the B-matrix vector of
sequency 1 is a slanted vector of step width 2 and step size 2 (3, 3, 1, 1, -1, -1, -3, -3).
The performance of the B-matrix transform, in terms of the figure of merit, is given in
Table I above. The B-matrix transform has something more than one-half of the gain of the
DCT over the WHT, with something less than one-half of the multiplications, and less than
one-fourth the hardware if the two-by-two transformer is used twice.
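
The slanted vector quoted above can be produced from the WHT vectors of sequency 1 and 3
by a two-by-two rotation. The factors 2/sqrt(5) and 1/sqrt(5) used below are an assumption
made for this sketch: Figure 4 is not reproduced in the text, and these values were chosen
only because they yield the quoted vector (3, 3, 1, 1, -1, -1, -3, -3) with two distinct
factors, three including sign.

```python
import math


def sequency(v):
    """Number of sign changes between adjacent elements of v."""
    return sum(1 for a, b in zip(v, v[1:]) if a * b < 0)


# Unnormalized WHT vectors of sequency 1 and 3 for N = 8.
w1 = [1, 1, 1, 1, -1, -1, -1, -1]
w3 = [1, 1, -1, -1, 1, 1, -1, -1]

# Hypothetical rotation factors (assumed, not taken from Figure 4).
c, s = 2 / math.sqrt(5), 1 / math.sqrt(5)

new1 = [c * a + s * b for a, b in zip(w1, w3)]    # replaces the sequency-1 vector
new3 = [-s * a + c * b for a, b in zip(w1, w3)]   # replaces the sequency-3 vector

# Rescale by sqrt(5) to show the integer step pattern.
slant = [round(x * math.sqrt(5)) for x in new1]
companion = [round(x * math.sqrt(5)) for x in new3]
dot = sum(a * b for a, b in zip(new1, new3))

print(slant, companion, sequency(new1), sequency(new3), dot)
```

Because the operation is a rotation, the two new vectors stay orthogonal, keep the norms
of the original WHT vectors, and keep sequencies 1 and 3.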

As a second example, suppose that it is desired to approximate the DCT by adding integer
products of the WHT vectors. For small integers, this operation can be implemented by
digital shifts-and-adds, and requires fewer significant bits to be retained. The matrix C
given in Figure 5 is an orthonormal transform matrix that is similar to the DCT. The
two-by-two matrix operating on the vectors of sequency 2 and 6 is a specialization of the
general two-by-two matrix having orthogonal rows with identical factors. The four-by-four
operation on the vectors of odd sequency is a specialization of the general four-by-four
matrix with orthogonal rows, identical factors, and the additional requirement of a
positive diagonal.

The specializations of the general matrices were made by requiring that the two-by-two
matrix integers have approximately the ratios found in the second (and third) rows of the A
matrix, and that the four-by-four matrix integers have approximately the ratios found in
the fifth (and eighth) rows of the A matrix. Since the A-matrix transform is the DCT, it is
ensured that the C transform vectors of sequency 2, 6, 1, and 7 will approximate the
corresponding DCT vectors.


The energy compaction results of the C transform, with the results of the WHT and DCT, are
given in Figure 6, for r = 0.95 and N = 8. The energy of the vectors of sequency 2, 6, 1,
and 7 is very similar to the energy of the DCT vectors, but the vectors of sequency 3 and 5
are different. The energy correspondence could be improved by matching the four-by-four
matrix factors to the average of the fifth and sixth rows in the A matrix, but there is
little potential data compression remaining. The theoretical performance of the C matrix,
in terms of the figure of merit, is given in Table I. The C-matrix transform obtains nearly
all the gain of the DCT over the WHT. If the rational form, instead of the integer form, of
the C-matrix transform were used, the computation would require 16 multiplications by 4
different factors (7 factors including sign differences). There is some reduction in
complexity from the implementation of matrix A.


EXPERIMENTAL IMAGE COMPRESSION RESULTS


Experimental results were obtained for two-dimensional, 8x8 sample block implementations of
the transforms considered above. Four video test images (Harry Reasoner, two girls, two
men, and band) were used in all tests. These images have correlation of 0.97 to 0.98
between elements in the scan line, and fit the first order Markov model, except for the
very detailed band image, which deviates from the Markov model and has an average in-line
correlation of 0.85 (11). Two different compression experiments were made.

The test images were first compressed by representing either thirty-two or sixteen of the
sixty-four 8x8 transform vectors, using an eight-bit uniform, full-range quantizer. The
other vectors were neglected. The patterns of the vectors transmitted and neglected are
given in Figure 7. The vectors are in sequency order, with the lowest sequency average
vector in the upper left corner of the pattern. The mean-square error for this compression
method and the four transforms is given in Table II. The B-matrix transform error is
intermediate between the WHT and DCT errors, and the C-matrix error is very close to the
DCT error. This is consistent with the Markov model energy compaction results above.
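
The vector-subset experiment can be sketched for a single block as follows. Everything
specific here is an assumption made for illustration: the retained pattern (the 4 x 4
lowest-sequency corner) is hypothetical, since the patterns of Figure 7 are not reproduced
in the text; the block is a synthetic ramp rather than one of the test images; only the
WHT is shown; and the eight-bit quantizer is omitted.

```python
import math


def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def sequency(v):
    """Number of sign changes between adjacent elements of v."""
    return sum(1 for a, b in zip(v, v[1:]) if a * b < 0)


def sequency_ordered_wht(n):
    """Orthonormal WHT with rows sorted by sequency, lowest first."""
    rows = sorted(hadamard(n), key=sequency)
    return [[x / math.sqrt(n) for x in row] for row in rows]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def retain_subset(block, basis, keep):
    """2-D transform, zero the coefficients where keep is False, inverse transform."""
    coeffs = matmul(matmul(basis, block), transpose(basis))
    kept = [[c if keep[u][v] else 0.0 for v, c in enumerate(row)]
            for u, row in enumerate(coeffs)]
    return coeffs, matmul(matmul(transpose(basis), kept), basis)


def mean_square_error(a, b):
    count = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count


N = 8
basis = sequency_ordered_wht(N)
block = [[float(4 * i + 3 * j) for j in range(N)] for i in range(N)]  # synthetic ramp
keep_16 = [[u < 4 and v < 4 for v in range(N)] for u in range(N)]     # hypothetical pattern
coeffs, approx = retain_subset(block, basis, keep_16)
mse_16 = mean_square_error(block, approx)
print(mse_16)
```

Because the transform is orthonormal, the mean-square error equals the energy of the
neglected coefficients divided by the number of samples, which is why better energy
compaction gives lower error for the same retained pattern.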

To obtain the greatest transform compression, the transmitted bits should be assigned to
the vectors according to the equation given above, and the coefficient quantizers should be
designed for minimum error given the coefficient energy and amplitude distributions. The
optimum theoretical bit assignments and quantizers depend on the particular transform used.

The test images, and most typical images, contain low-contrast, high-correlation background
areas, and edges where correlation is low. The bit assignments and quantizer designs based
on the stationary Markov model ignore this nonstationarity, and designs that consider
low-contrast areas and edges give improved mean-square error and subjective performance.
Such improved designs have been devised for the WHT (11), and have been tested with the
DCT, B-matrix, and C-matrix transforms. The transmission rate and mean-square error results
are given in Figure 8, for the test images compressed in the video field. The DCT gives
improved error performance, and the B and C matrix transforms are intermediate, but the B
and C matrix results are relatively poorer than those in Table II. The DCT gives more rate
reduction than the WHT, about 0.2 to 0.5 bits per sample. As a two-dimensional transform
has twice the gain of a one-dimensional transform (10), the theoretical gain of the DCT
over the WHT, for r = 0.95, should be twice the 0.15 bits per sample of Table I, or 0.30
bits per sample.

The lower error of the DCT, B-matrix, and C-matrix transforms does indicate subjective
improvement in the compressed images. This subjective improvement is larger at lower total
bit rates, due to the relative increase of larger, more noticeable errors at the lower
rates, and due to the more objectionable, blocky nature of large WHT errors. The B and C
matrix errors are subjectively more similar to the DCT errors than to WHT errors, because
the higher energy vectors approximate the DCT vectors.

It is not surprising that a design optimized for the WHT gives good results for the DCT and
similar transforms. The transform compression introduces errors in three ways: by not
transmitting vector coefficients, by using quantizers that are too narrow, and by
quantization errors within the quantizer range. The DCT, because of its superior energy
compaction, reduces the first two sources of error. Although the quantizers used are nearly
uniform, they do have smaller quantization steps for low coefficient values, so the third
source of error is also reduced. Any compression design will give better performance with
the DCT. From the similarity in energy compaction, a good design for the WHT should be
reasonably effective for the DCT. However, further performance gains can be made with the
DCT and with B-matrix and C-matrix transforms, by optimizing the compression designs for
the transform used.


The error statistics show that the lower mean-square error of the DCT is due both to fewer
large errors, which nearly always occur at edges, and to fewer small errors, which occur in
flat areas and edges. The subjective appearance of the compressed image confirms that the
DCT produces both smoother low contrast areas and less distorted edges. Since the low
contrast areas have very high correlation, and since the edges, though not noise-like, can
be approximated by a low-correlation Markov model, the mean-square error and subjective
results agree with the theoretical result that the DCT is superior to the WHT for all values
of correlation (see Table I).

CONCLUSION 

The Karhunen-Loeve transform for data with stationary correlation, the discrete cosine
transform, the Walsh-Hadamard transform, and other familiar transforms are even/odd vector
transforms whose coefficients can be obtained by sparse matrix multiplications of the
coefficients of other even/odd transforms. Of the familiar transforms, the Walsh-Hadamard
transform is the simplest to implement, but has the smallest compression gain. Using the
Walsh-Hadamard transform followed by a sparse matrix multiplication allows implementation
of any even/odd transform. The discrete cosine transform has a difficult implementation,
but very closely approaches the optimum performance for first-order Markov data. As the
form of the vectors is modified to approach that of the discrete cosine vectors, the vector
energy compaction and the theoretical and experimental image compression results approach
those of the discrete cosine transform. The theoretical data compression reliably indicates
the difference in experimental performance for these transforms. About one-half of the
performance difference between the Walsh-Hadamard and the discrete cosine transforms can be
achieved by simple post-processing of the Walsh-Hadamard coefficients.

TABLE I  The Figure of Merit f for Various Transforms and Correlations.
(A "?" marks a digit that is illegible in the source.)

                             Transform

Correlation, r    KLT        DCT        WHT        B Matrix   C Matrix

0.99              -19.817    -19.77?    -1?.4?9    -19.205    -19.592
0.95              -11.741    -11.729    -10.540    -11.204    -11.???
0.90              -8.?79     -8.141     -7.111     -7.875     -8.180
0.80              -5.1?2     -5.097     -4.117     -4.711     -4.954
0.70              -?.402     -?.128     -2.745     -1.056     -1.214
0.50              -1.451     -1.3?6     -1.116     -1.261     -1.111
0.00              0.00


TABLE II  The Mean-Square Error for the WHT, DCT, B Matrix and C Matrix
Transforms with a Subset of Vectors Retained.
(A "?" marks a digit that is illegible in the source.)


Mean-square error for 32 vectors retained

              Reasoner    Two Girls   Two Men     Band

WHT           0.556       0.806       1.694       3.948
B Matrix      0.500       0.738       1.561       3.626
C Matrix      0.442       0.666       1.516       3.310
DCT           0.446       0.660       1.515       3.036


Mean-square error for 16 vectors retained

              Reasoner    Two Girls   Two Men     Band

WHT           1.619       2.206       4.801       12.122
B Matrix      1.507       2.093       4.552       1?.056
C Matrix      1.427       2.029       4.447       11.897
DCT           1.410       2.01?       4.4?6       11.628

Figure 1 - The A Matrix Used to Obtain the DCT From the WHT.